Logistic Regression in Python – Simple Example

A simple example of how to approach logistic regression in Python using Matplotlib, Seaborn, and Scikit-learn. The data is pulled from Kaggle.com and found here. Normally this comes as a test and train set, but I'm analyzing the training set only to see the accuracy of the model. The data here is a little dirty and needed some cleansing first, which is probably worthy of another short tutorial / post on other methods of initial cleansing before working on data... ...
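As a rough illustration of the workflow described above (fit on part of a single labeled set, then check accuracy on the held-out part), here is a minimal sketch. It assumes scikit-learn is installed and uses synthetic data from `make_classification` as a stand-in, since the Kaggle dataset itself is not reproduced here; the sample sizes and split ratio are arbitrary choices, not the post's.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a cleaned training set (assumption: not the Kaggle data).
X, y = make_classification(n_samples=500, n_features=6, random_state=42)

# Hold out 30% of the single labeled set to estimate accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit the classifier on the remaining 70%.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score predictions against the held-out labels.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Hold-out accuracy: {accuracy:.3f}")
```

With real data, the cleansing step would come before the split, so both portions see the same transformations.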
Read More

Modular Sprite-sheets: Pygame

I normally don't share little strange bits of unfinished code - but I thought this was pretty cool. I was messing around with coding a little "fish tank" game in Python utilizing Pygame. One thing I'm bad about doing is attempting to make everything modular - which is why my Text Adventure game turned out to be a Text Adventure Engine. I wanted to utilize Pygame, but I wanted to load in sprite sheets in a predictable manner. In this case I found some sprites that are generally utilized for RPG Maker that always contain a predictable pattern (Multiple sprites on the same sheet, differing only by color). Given that there are so many available due to its popularity, it also makes things easy to play with. I believe there are licensing problems here, but this is just for personal use as an example. You could utilize any sheets that are uniform in size. Here's a little example of one...
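To make the "uniform sheet" idea concrete, here is a minimal sketch of how frame rectangles could be computed for a sheet laid out on a regular grid. The function name `frame_rects` and the 96x128 sheet with a 3x4 grid are hypothetical, chosen only for illustration; this is not the code from the post.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def frame_rects(sheet_width: int, sheet_height: int, cols: int, rows: int) -> List[Rect]:
    """Return one rectangle per frame, left to right, then top to bottom."""
    if cols <= 0 or rows <= 0:
        raise ValueError("cols and rows must be positive")
    if sheet_width % cols or sheet_height % rows:
        raise ValueError("sheet size is not an exact multiple of the grid")
    frame_w = sheet_width // cols
    frame_h = sheet_height // rows
    return [
        (col * frame_w, row * frame_h, frame_w, frame_h)
        for row in range(rows)
        for col in range(cols)
    ]


# Hypothetical sheet: 96x128 pixels holding a 3x4 grid of 32x32 frames.
rects = frame_rects(96, 128, cols=3, rows=4)
print(rects[0], rects[-1])
```

In Pygame, each tuple could then be passed to `Surface.subsurface` on a sheet loaded with `pygame.image.load` to pull out the individual frames.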
Read More
Python Tips/Tricks

Analytics Use Case: Financial data with Pandas, Matplotlib

This is another great little exercise I completed that shows some basic to advanced analysis of stock ticker data. The instructions were written several years ago, so the dataset used no longer matches what I pull today. This means that some of the charts are not showing the intended data, and the peaks/lows the author comments on may not be visible. It's hard to say why the data they originally used is different since I don't have it. However, this really isn't about the data anyway, but more about how to perform some more advanced analytics and visualizations with Python. ...
Read More
Python Tips/Tricks

Including Jupyter Notebooks on WordPress

UPDATE 2020/08/10: I believe there is a more reliable method here. The method below still works, but I've encountered too many problems to continue using it at the moment... This really isn't a standard Python tip, but it may interest folks running WordPress sites who want this AND are hitting problems doing so. You can end up embedding a Jupyter notebook out of GitHub directly (sort of) and it looks wonderful - see the post below for an example: https://www.mikekale.com/analytics-use-case-basic-pandas-i-o-matplotlib-seaborn/ There is a great little post here on how to do it: https://www.eg.bucknell.edu/~brk009/notebook-on-wp/ Ultimately it points to the author of a little plugin here: https://www.andrewchallis.co.uk/portfolio/php-nbconvert-a-wordpress-plugin-for-jupyter-notebooks/ These instructions will *probably* work for most of you; however, I host with Nearlyfreespeech.net, which is a great hosting company but very security conscious. If you're looking for WordPress/Themes/Plugins to auto update - it's not the spot. Most of my work on the backend is done in SSH, and getting the right permissions/groups/owners on the files and...
Read More
Python Tips/Tricks

Analytics Use Case: Basic Pandas I/O, Matplotlib, Seaborn

Attached is a great practical exercise I completed that shows you how to perform basic I/O utilizing Pandas, and data visualization utilizing Matplotlib and Seaborn. Along with seeing how to load the data, you can see some great basic functions of utilizing Pandas for data cleansing such as adding columns or groupby's. Several basic lambda functions are also shown, which can be a bit mind bending at first for those who haven't dealt with them before. ...
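For readers new to lambdas, here is a minimal sketch of the kind of operations mentioned above: adding columns with lambda functions and summarizing with a groupby. It assumes pandas is installed; the column names and values are made up for illustration and are not taken from the exercise.

```python
import pandas as pd

# Small made-up dataset (assumption: not the exercise's data).
df = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "units": [10, 4, 6, 8],
    "unit_price": [2.5, 3.0, 2.5, 3.0],
})

# Row-wise lambda: build a new column from two existing ones.
df["revenue"] = df.apply(lambda row: row["units"] * row["unit_price"], axis=1)

# Element-wise lambda: label each order by its size.
df["size"] = df["units"].apply(lambda u: "large" if u >= 8 else "small")

# Groupby: total revenue per region.
revenue_by_region = df.groupby("region")["revenue"].sum()
print(revenue_by_region)
```

The row-wise version is shown only to illustrate the lambda; for simple arithmetic like this, `df["units"] * df["unit_price"]` does the same thing faster.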
Read More
Python Tips/Tricks

Integrate Tableau and SSRS – A complete tutorial

One of the most difficult parts of Business Intelligence is finding the right tool for the right situation. Too often folks will discover Tableau and use it for every single bit of reporting they need, including data that used to be in Excel. The problem is, they will utilize Tableau as if it were a glorified Excel replacement and try to mimic Excel's features. Unfortunately, if there is one thing Tableau does horribly, it is displaying large amounts of raw data. Tableau is great at visualizations, which is the wheelhouse it should stay in. However - there is often a legitimate need to not only get at the raw data, but display it in a "prettier" format. Viewing data within Tableau isn't a great experience. Often it's littered with calculations you don't want to see, and users want you to organize the raw data (maybe add some nice headers). None of this is really possible or practical within Tableau. But -...
Read More
Tableau Tips/Tricks

SQL Server integration with Python

This one is very important to me in particular, as I utilize SQL Server quite a bit. Utilizing pyodbc and Pandas you can bring your data in.

Required Libraries

    conda install pyodbc    # Anaconda installation
    # pip install pyodbc    # Alternative

Here we can try reading from the AdventureWorks2017 database in SQL Server, as an example (replacing your server_name):

    import pyodbc
    import pandas as pd

    # Connect with Windows authentication; replace server_name with your server
    conn = pyodbc.connect('Driver={SQL Server};'
                          'Server=server_name;'
                          'Database=AdventureWorks2017;'
                          'Trusted_Connection=yes;')

    # Run the query and load the result set into a dataframe
    sql_query = pd.read_sql_query('SELECT * FROM Person.person', conn)
    sql_query.head()

Notice this comes back as a dataframe. Similarly - we could also write back to SQL Server if we wanted to, utilizing slightly altered logic (abbreviated insert given how many...
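The general write-back pattern can be sketched without a SQL Server instance. The example below swaps in Python's built-in `sqlite3` module purely as a stand-in, because it follows the same DB-API style as pyodbc (cursor, `?` parameter markers, `executemany`, `commit`). The `person` table and its rows are hypothetical, and this is not the post's own insert code.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a SQL Server connection
# (assumption: with pyodbc you would use pyodbc.connect(...) instead).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical target table, not AdventureWorks' Person.person.
cur.execute("CREATE TABLE person (first_name TEXT, last_name TEXT)")

# Parameterized insert: the ? markers keep values out of the SQL string.
rows = [("Ada", "Lovelace"), ("Grace", "Hopper")]
cur.executemany(
    "INSERT INTO person (first_name, last_name) VALUES (?, ?)", rows
)
conn.commit()  # make the inserts permanent

# Read the rows back to confirm the write.
result = cur.execute(
    "SELECT first_name, last_name FROM person ORDER BY last_name"
).fetchall()
print(result)
conn.close()
```

Parameter markers rather than string formatting are the main design choice here: they avoid quoting bugs and SQL injection regardless of which driver sits underneath.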
Read More

Data input/output in Python – Pandas

Section breakdown: Import/Write CSV, Import/Write Excel, Import HTML (scraping)

Import/Write CSV

    import pandas as pd

    df = pd.read_csv('example')            # import csv
    df.to_csv('My_output', index=False)    # write to csv, don't include the index column

Import/Write to Excel

    import pandas as pd

    df = pd.read_excel('Excel_Sample.xlsx', sheet_name='Sheet1')    # import Excel
    df.to_excel('Excel_Sample.xlsx', sheet_name='Sheet1')           # write to Excel

Import HTML

Required Libraries, assuming Anaconda is installed

    conda install lxml
    conda install html5lib
    conda install beautifulsoup4

In this case I'm going to reference a table found on the FDIC.gov website: https://www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/

    import pandas as pd

    data = pd.read_html('https://www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list/')

Notice that this reads in every table it can find on the page as a list of dataframes. You can explore the tables it picked up by viewing data[0], data[1], etc.

    data[0].head()

...
Read More
Python Tips/Tricks

Dashboard Use-Case: Mapping NFL Plays

As part of a paper I wrote, I utilized Tableau to generate a visual representation of plays run by various NFL players. It turns out to be quite a simple thing to do, and appears more complex than it is. The dataset comes from a Kaggle challenge located here, which provides data for 2 years of NFL seasons and detailed X,Y coordinates for each of the plays and players. With this information we can map the points on a standard scatter plot over time, and play connect the dots. In an effort to make things quicker, I pared down the dataset to only include 2 players and the top 100 plays of each. The original dataset is approximately 76 million records (pared down to ~50k). With a minimum of X, Y and Time, we plot the Sum(X) and Sum(Y) on Cols/Rows (making sure the X is on the columns). At this point you could add time to detail and you would see each...
Read More
Tableau Tips/Tricks