Something I love about what I do is getting to see everyone’s data and business strategy. But, because I deal in everyone’s private ideas, I can’t talk about most of what I learn. This is not only frustrating, it also leads to very boring blog posts. And yet, working on my latest project, for once I feel as excited about the process as I am about the final product, so I’m going to talk about my process.
A client asked me to put together a program that automates their data gathering and checking. I’ve done similar projects in the past using Python to scrape and process data from my local machine. However, this client wanted their own user-friendly web portal that non-technical employees could use themselves and upload data ad hoc. I’d never built something so complicated without the aid of existing software before, but decided to push ahead with a “from scratch” idea.
To make a long story short, after much trial and error, I landed on a combination of Flask, PythonAnywhere, and AWS Lambda (and wow am I glad I did). PythonAnywhere is designed specifically to host Python apps, and the back-end user interface makes it easy to upload, deploy, and edit source files on the fly. No weird file configs or SSH bash screens or total absence of error messages leaving you wondering why nothing works. Turning on HTTPS doesn’t break your website for no reason. It’s easy to point a CNAME at your app from your domain registrar and have a custom domain name with zero fuss or hard-coded messiness. The point: I’m a big fan. PythonAnywhere is definitely the best hosted server I’ve used.
Flask, too, radically simplified my web app build compared to what I was used to doing in Django. Flask pretty much works right out of the box with a few lines of code. The documentation is reasonably clear, making it simple to fix bugs and build out your app. The genius of Flask is that the default version is so tiny, yet all the extensions are out there waiting to be added as you need them. You need security? Just add the security extension. Server-side session storage? Just add that extension. I love that Flask can be incredibly complex, but is only ever as complex as is needed. You never have to wade through a million unnecessary features just to get to the one you want.
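To give a sense of what “a few lines of code” means, here is a minimal sketch of a Flask app. It is not the client’s portal: the route names and the messages they return are made up for illustration.

```python
from flask import Flask, jsonify

# A single Flask object is the whole application.
app = Flask(__name__)


@app.route("/")
def index():
    # Hypothetical landing page; a real portal would render a template here.
    return "Data portal is running."


@app.route("/health")
def health():
    # Hypothetical status endpoint that returns a small JSON payload.
    return jsonify(status="ok")


if __name__ == "__main__":
    # Starts Flask's built-in development server (local testing only).
    app.run(debug=True)
```

Run the file locally and the development server starts up. On a host like PythonAnywhere, the `app` object is what the hosting configuration would point at instead.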
Originally, I had the whole script running on PythonAnywhere, and checking 5 web pages at a time worked fine, but when I bumped the request up to 5,000 web pages, that increase killed my app. That’s where Lambda comes in. Moving the page checks to Lambda lets me run them in parallel instead of in one long loop, so whether I check 5 web pages or 5,000, the requests run concurrently and the results come back quickly.
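The general idea of fanning out page checks, rather than looping through them one at a time, can be sketched with Python’s standard library. This is only an illustration of the pattern, not my actual Lambda setup: `check_pages`, `fake_check`, and the example URLs are hypothetical, and the per-page check is a stand-in that makes no network calls.

```python
from concurrent.futures import ThreadPoolExecutor


def check_pages(urls, check_one, max_workers=50):
    """Run check_one(url) for every URL concurrently; return {url: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with urls.
        results = list(pool.map(check_one, urls))
    return dict(zip(urls, results))


def fake_check(url):
    # Hypothetical stand-in for the real per-page check. In a Lambda-based
    # setup this would instead hand the URL to a remote function.
    return {"url": url, "ok": url.startswith("https://")}


if __name__ == "__main__":
    pages = [f"https://example.com/page/{i}" for i in range(5000)]
    report = check_pages(pages, fake_check)
    print(len(report), "pages checked")
```

The calling code looks the same for 5 pages or 5,000. Only the amount of work handed to the pool changes, which is the property that made moving the heavy lifting off the web app so appealing.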
It’s amazing how much more fun coding is when you’re not just secretly crunching data behind the scenes and can actually see it working online.