First Draft?

I think I’ve made enough progress with the coding to have something useful. And everything is using the AngularJS framework, so I’m pretty buzzword compliant. Well, that may be to bold a statement, but at least it can make data that can be analyzed by something a bit more rigorous than just looking at it in a spreadsheet.

Here the current state of things:

The data analysis app: http://tajour.com/iRevApps/irevdb.html

Changes:
  • Improved the UX on both apps. The main app first.
    • You can now look at other poster’s posts without ‘logging in’. You have to type the passphrase to add anything though.
    • It’s now possible to search through the posts by search term and date
    • The code is now more modular and maintainable (I know, not that academic, but it makes me happy)
    • Twitter and Facebook crossposting are coming.
  • For the db app
    • Dropdown selection of common queries
    • Tab selection of query output
    • Rule-based parsing (currently keyDownUp, keyDownDown and word
    • Excel-ready cvs output for all rules
    • WEKA ready output for keyDownUpkeyDownDown and word. A caveat on this. The WEKA ARFF format wants to have all session information in a single row. This has two ramifications:
      • There has to be a column for every key/word, including the misspellings. For the training task it’s not so bad, but for the free form text it means there are going to be a lot of columns. WEKA has a marker ‘?’ for missing data, so I’m going to start with that, but it may be that the data will have to be ‘cleaned’ by deleting uncommon words.
      • Since there is only one column per key/word, keys and words that are typed multiple times have to be grouped somehow. Right now I’m averaging, but that looses a lot of information. I may add a standard deviation measure, but that will mean double the columns. Something to ponder.

Lastly, Larry Sanger (co-founder of Wikipedia) has started a wiki-ish news site. It’s possible that I could piggyback on this effort, or at least use some of their ideas/code. It’s called infobitt.com. There’s a good manifesto here.

Normally, I would be able to start analyzing data now, with WEKA and SPSS (which I bought/leased about a week ago), but my home dev computer died and I’m waiting for a replacement right now. Frustrating.

Leave a comment