This is the fourth in a series of posts charting the progress of a programmer starting out in data science. The first post is A Pilgrim’s Progress #1: Starting Data Science. The previous post is A Pilgrim’s Progress #3: NumPy.
I’m trying something new out here. These posts are written in Jupyter, which is an extremely handy way to intermingle text and executable code. It comes with Anaconda, which is the best way to get everything going if you’re starting out. For the first couple of posts I cut and pasted the material over to WordPress. This time I downloaded the Jupyter notebook as HTML and pasted it in. That was 100x faster, but it’s far from a perfect solution, because the result is painful to edit once pasted in. Any ideas?
Pandas is insanely versatile and capable of far more than I’ve covered in this already excessively long set of notes. At best, these notes give an idea of how it works and a quick tour of what it looks like in use.