Big Data Sets

Regarding nothing at all, other than that my brain is mush, I’ve spent the last few days wrestling with a couple of reasonably large data sets (42K+ records and 46K+ records) and trying to winnow out a much smaller subset of data that’s not explicitly encoded.

In other words, I’m looking for needles in haystacks and trying to be clever.

It’s not rocket science, but it does require some care or you’ll go straight off into the Twilight Zone and have to go back a couple of steps to recover. Sort this, search for that, pull these records, but don’t do it in the wrong order!

My group did this during my MBA about 15 years ago. (It’s astonishing to me that it’s been that long, truly, but yep, double checked, I graduated in 2007.) We had to do an analysis on a data set of our choosing and we chose one from one of my classmates’ companies that was monstrous! I’m not sure we got results on a third of the things we were supposed to be looking for, but we did show what wasn’t there for a bunch of things and got great grades for displaying awesome audacity in even trying.

This project isn’t anything like that big, or audacious, but it is making my eyes and brain bubble and bug out. However, tonight I’ve got some fantastic Jean Michel Jarre on the headphones, so that helps a lot. It also helps that the Angels sort of suck really bad this year, so I don’t feel bad about not watching them.

How was your Monday?


Filed under Random Blatherationings
