Clicky

🐪 Reformed JAPHs: Alphabetic Indexing

Note: This post was edited for clarity. Many years ago, I was a Perl programmer. Then one day I became disillusioned at the progress of Perl 6 and decided to import this. This seems to be a fairly common story for Perl to Python converts. While I haven’t looked back much, there are a number of things I really miss about perl (lower case intentional). I miss having value types in a dynamic language, magical and ill-advised use of cryptocontext, and sometimes even pseudohashes because they were inexcusably weird....

April 1, 2011 · Ryan O'Neil

📈 Simulating GDP Growth

I hope you saw “China’s way to the top” on the Post’s website recently. It’s a very clear presentation of their statement and is certainly worth a look. So say you’re an economist and you actually do need to produce a realistic estimate of when China’s GDP surpasses that of the USA. Can you use such an approach? Not really. There are several simplifying assumptions the Post made that are perfectly reasonable....

February 23, 2011 · Ryan O'Neil

🧐 Data Fitting 2a - Very, Very Simple Linear Regression in R

Note: This post was updated to include an example data file. I thought it might be useful to follow up the last post with another one showing the same examples in R. R provides a function called lm, which is similar in spirit to NumPy’s linalg.lstsq. As you’ll see, lm’s interface is a bit more tuned to the concepts of modeling. We begin by reading in the example CSV into a data frame:...

February 16, 2011 · Ryan O'Neil

🧐 Data Fitting 2 - Very, Very Simple Linear Regression in Python

This post is based on a memo I sent to some former colleagues at the Post. I’ve edited it for use here since it fits well as the second in a series on simple data fitting techniques. If you’re among the many enlightened individuals already using regression analysis, then this post is probably not for you. If you aren’t, then hopefully this provides everything you need to develop rudimentary predictive models that yield surprising levels of accuracy....

February 15, 2011 · Ryan O'Neil

🗳 Off the Cuff Voter Fraud Detection

Consider this scenario: You run a contest that accepts votes from the general Internet population. In order to encourage user engagement, you record any and all votes into a database over several days, storing nothing more than the competitor voted for, when each vote is cast, and a cookie set on the voter’s computer along with their apparent IP addresses. If a voter already has a recorded cookie set they are denied subsequent votes....

November 30, 2010 · Ryan O'Neil

🧐 Data Fitting 1 - Linear Data Fitting

Note: This post was updated to work with Python 3 and PySCIPOpt. The original version used Python 2 and python-zibopt. Data fitting is one of those tasks that everyone should have at least some exposure to. Certainly developers and analysts will benefit from a working knowledge of its fundamentals and their implementations. However, in my own reading I’ve found it difficult to locate good examples that are simple enough to pick up quickly and come with accompanying source code....

November 23, 2010 · Ryan O'Neil

🐍 Monte Carlo Simulation in Python

Note: This post was updated to work with Python 3. One of the most useful tools one learns in an Operations Research curriculum is Monte Carlo Simulation. Its utility lies in its simplicity: one can learn vital information about nearly any process, be it deterministic or stochastic, without wading through the grunt work of finding an analytical solution. It can be used for off-the-cuff estimates or as a proper scientific tool....

October 8, 2009 · Ryan O'Neil

⚡️ On the Beauty of Power Sets

One of the difficulties we encounter in solving the Traveling Salesman Problem (TSP) is that, for even a small numer of cities, a complete description of the problem requires a factorial number of constraints. This is apparent in the standard formulation used to teach the TSP to OR students. Consider a set of nn cities with the distance from city ii to city jj denoted dijd_{ij}. We attempt to minimize the total distance of a tour entering and leaving each city exactly once....

February 27, 2009 · Ryan O'Neil

📐 Uncapacitated Lot Sizing

Uncapacitated Lot Sizing (ULS) is a classic OR problem that seeks to minimize the cost of satisfying known demand for a product over time. Demand is subject to varying costs for production, set-up, and storage of the product. Technically, it is a mixed binary integer linear program – the key point separating it from the world of linear optimization being that production cannot occur during any period without paying that period’s fixed costs for set-up....

February 20, 2009 · Ryan O'Neil