Taking Google Sheets to (a) Class.

Taking Google Sheets to (a) Class.

John DeJesusBlockedUnblockFollowFollowingMar 17I am currently building a Flask app for teachers.

Since Google Drive has been adopted by teachers, Google sheets are used by them also.

One of my app’s features is to easily allow teachers to copy and paste the sheet link into the app and submit it through a form.

It will then convert it into a pandas data frame and drop the null rows and columns.

Google Sheets IconIt is a work in progress but I wanted to share the class made so far since I don’t find many Python class examples for data collection or preprocessing.

Many of the examples I have seen are useful for teaching the concept of classes.

But most of the time they are not practical enough or specific to data science or ML.

After all, we could always use more practical examples of Python classes.

See, Yoda knows.

(Self-made on imgflip.

com)But why would you need a class?.Can’t you just write the code and be ok?If this were for analysis purposes, then…yes.

But production code, as far as I know, follows an object-oriented programming approach.

This means that your code is split between scripts.

These scripts are primarily composed of classes and functions.

Code written in this manner is also easier to reuse for additional purposes.

Take the ecdf function I used in this post as an example.

Data Science Code Refactoring ExampleWhen learning to code for data science we don’t usually consider the idea of modifying our code to reap a particular…towardsdatascience.

comWith this function, you can quickly reproduce this plot immediately as opposed to retyping the same matplotlib templating code over and over again.

Great!.Show me your example!Of course!.I will share this example in pieces and explain each part as we go.

# Import necessary librariesimport gspreadimport pandas as pdfrom gspread_dataframe import get_as_dataframefrom oauth2client.

service_account import ServiceAccountCredentialsgspread is the library used to handle the input and manipulation of our Google sheets.

We will use pandas to drop all empty rows and columns in our sheet.

gspread-dataframe is our key library to coverting our sheets into data frames.

Lastly, oauth2client is the library suggested by gspread to authenticate our credentials.

If you are not familiar with this type of authentication you can see how it is performed here.

# Create class name and init functionclass GoogleSheetProcessor: def __init__(self): self.

scope = ['https://spreadsheets.

google.

com/feeds', 'https://www.

googleapis.

com/auth/drive'] self.

credentials = ServiceAccountCredentials.

from_json_keyfile_name( 'Credentials.

json', self.

scope) self.

gc = gspread.

authorize(self.

credentials)First, we name our class.

By convention, classes are named using the CapWords convention.

From there we create the init function to tell the class to setup variables assigned to the class.

In our case, we will have the class initiate the three variables need to authenticate our credentials.

json file.

# Get google sheet methoddef obtain_google_sheet(self, link): sheet = self.

gc.

open_by_url(link) sheet = sheet.

get_worksheet(0) return sheetHere we create a method (aka function of a class) that will take the Google sheet link as a parameter.

It then uses our instance of self.

gc to take in the link and then take the first worksheet in the Google sheet file.

Finally, it returns the worksheet we want as sheet.

# Convert sheet to data frame and drop unnecessary rows and columns@staticmethoddef worksheet_to_df(worksheet): # adjust input so nan rows and columns are not imported df = get_as_dataframe(worksheet, parse_dates=True, header=0) df.

dropna(axis='columns', how='all', inplace=True) df.

dropna(axis='rows', how='all', inplace=True) return dfOur second method takes a worksheet as input and converts it into a pandas data frame.

It has the @staticmethod class decorator since we do not need to have the class instance self as a parameter to use it.

We then use the Pandas dropna data frame method to remove the rows and columns which don’t contain any values.

Now let us step back and admire the entire class as a single entity.

# Create class name and init functionclass GoogleSheetProcessor: def __init__(self): self.

scope = ['https://spreadsheets.

google.

com/feeds', 'https://www.

googleapis.

com/auth/drive'] self.

credentials = ServiceAccountCredentials.

from_json_keyfile_name( 'Credentials.

json', self.

scope) self.

gc = gspread.

authorize(self.

credentials)# Get google sheet methoddef obtain_google_sheet(self, link): sheet = self.

gc.

open_by_url(link) sheet = sheet.

get_worksheet(0) return sheet# Convert sheet to data frame and drop unnecessary rows and columns@staticmethoddef worksheet_to_df(worksheet): # adjust input so nan rows and columns are not imported df = get_as_dataframe(worksheet, parse_dates=True, header=0) df.

dropna(axis='columns', how='all', inplace=True) df.

dropna(axis='rows', how='all', inplace=True) return dfIt’s possible!.(Self-made from imgflip.

com)So how did use this class?I used it as part of the processing for a Flask form.

Below is the route function that utilizes our class.

@app.

route('/upload/<email>', methods=['GET', 'POST'])@login_requireddef upload(email): user = User.

query.

filter_by(email=email).

first_or_404() form = DataInputForm() if form.

validate_on_submit(): if form.

google_sheet.

data is None and form.

excel_sheet.

data is None: flash('Form Empty.

Please enter a Google Sheet link or load an Excel file') return redirect(url_for('upload')) elif form.

google_sheet.

data and form.

excel_sheet.

data: flash('Cannot Submit Both Form Types.

Only Google Sheet OR Excel File') return redirect(url_for('upload')) elif form.

google_sheet.

data: gp = GoogleSheetProcessor() data = gp.

obtain_google_sheet(form.

google_sheet.

data) data = gp.

worksheet_to_df(data) flash('Upload Successful') return render_template('dashboard.

html', data=data) return render_template('upload.

html', title='Data_Upload', user=user, form=form)If you are not familiar with Flask or Flask forms this may seem confusing.

Let’s focus on the part where we implement our class.

elif form.

google_sheet.

data: gp = GoogleSheetProcessor() data = gp.

obtain_google_sheet(form.

google_sheet.

data) data = gp.

worksheet_to_df(data)After we initiate the class, we then assign the sheet to the variable data.

form.

google_sheet.

data is where the app will obtain the inputted link from the DataInputForm initiated earlier in the route function.

The link is then passed into our worksheet_to_df method to be converted and slimmed down.

I noticed that there is form.

excel_sheet.

data in your route function.

What are you doing with Excel files?This is actually a partially complete route function.

I can’t reveal too much now but there will be functionality for teachers to upload Excel files as well.

I will write more articles related to this class once an MVP version of the app is launched.

Picture by DaMongMan on FlickrOne last thought before we go….

Writing classes can daunting.

If you are studying to be a data scientist or analyst it is not a common concept that comes up.

Even the idea of writing functions can seem scary.

But you can do it and it is a useful skill.

If you can at least build functions you can cut some of the repetitive code writing for your visualizations and ML models.

Even better, if your models are going to be integrated into a product, your engineer will thank you for making their life easier when they add your code to their production code.

Thanks for reading!.I hope this Python class example will help you with building your own data related classes.

Try doing one yourself using a data type of your choice.

Which will you start with first?.The data can be from a CSV or a SQL server.

If you are not yet comfortable with making a class, try to turn those processes into functions first, then work your way up to building an entire class around those functions.

Until next time,John DeJesus.. More details

Leave a Reply