Wednesday, October 14, 2009
J's and U's Updated / Speed Increases
I mentioned a few weeks ago that I planned on making I's/J's and U's/V's look the same on the back-end, while preserving their traditional orthographies on the front-end. I've just completed this task!
My main motivation for this update is that certain passages stored in The Latin Library follow the older conventions of writing J for consonantal I, or U for both consonantal and vocalic V. Numen's parsing engine was having trouble recognizing forms like jecit (iecit) and uiuus (vivus). So now, after a bit of work, the engine is updated and recognizes more possibilities than ever. Incidentally, J's are stored internally as I's, and U's as V's.
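For the curious, here is a minimal sketch of what that kind of back-end normalization could look like. It is not Numen's actual code: the function name `normalize_orthography` and the choice of Python are assumptions made purely for illustration. The idea is just that every spelling variant collapses to one internal form before lookup.

```python
# Hypothetical helper, NOT Numen's real implementation.
# Maps J -> I and U -> V so that spelling variants share one back-end form.
_BACKEND_FORM = str.maketrans({"j": "i", "J": "I", "u": "v", "U": "V"})


def normalize_orthography(word: str) -> str:
    """Return the internal (back-end) spelling of a Latin word."""
    return word.translate(_BACKEND_FORM)


# Variant spellings of the same word end up identical internally.
for spelling in ("jecit", "iecit", "uiuus", "vivus"):
    print(spelling, "->", normalize_orthography(spelling))
```

With a mapping like this, jecit and iecit both become iecit, and the U-only and V/U spellings of vivus become the same string, so the engine only has to recognize one form. The original spelling can still be kept separately for display on the front-end.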
Another project I completed at the same time is an order-of-magnitude speed improvement for parsing. I was looking for ways to make the engine faster and discovered a shortcut that boosts speed tremendously. The engine used to spend between 250ms and 500ms parsing each word! That was always disappointing to me, though I had gotten around the problem by caching the results. Now, however, parsing a word takes about 25ms!
Why bother improving the speed? Because soon I will be implementing word lists and frequency lists! A word list, of course, is just a "mini-lexicon" that defines only the words in your chosen passage, and a frequency list is a list of the words in a passage ordered by how often they appear. The word list will make it quick to work on the vocabulary for a passage, and the frequency list will help Latin students study more effectively by giving them the most frequent words first. I'm very excited about this feature, but I don't anticipate it will be done before January 10th (giving me the winter holiday to work on it).
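To make the two ideas concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the function names, the tiny `lexicon` dictionary, and especially the fact that it counts surface forms. The real feature would presumably run each word through the parsing engine and count dictionary headwords instead.

```python
import re
from collections import Counter


def frequency_list(passage: str) -> list[tuple[str, int]]:
    """Words of the passage, most frequent first (counts surface forms only)."""
    words = re.findall(r"[a-z]+", passage.lower())
    return Counter(words).most_common()


def word_list(passage: str, lexicon: dict[str, str]) -> dict[str, str]:
    """A 'mini-lexicon': only the entries for words that occur in the passage."""
    seen = {word for word, _ in frequency_list(passage)}
    return {word: lexicon[word] for word in sorted(seen) if word in lexicon}


# Hypothetical sample data, just to show the shape of the results.
passage = "et arma et viri et arma"
lexicon = {"et": "and", "arma": "arms, weapons", "canis": "dog"}

print(frequency_list(passage))
print(word_list(passage, lexicon))
```

The frequency list puts et first because it occurs most often, and the word list leaves out canis because it never appears in the passage.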
That's all for now!
Labels: accuracy, database, development, features, frequency lists, google cache, orthography, parsing engine, performance, slowness, vergil, word lists
Wednesday, May 27, 2009
Speed Improvements
Sometimes I take a little bit of time off from reading and cogitating to work on important stuff -- stuff like speed improvements for this website.
This is incredibly nerdy stuff. It actually takes my mind off harder things. Don't ask!
The biggest improvements came in database queries. Some of the queries I was using were executing more slowly than I would have expected. In researching this problem I discovered something called prepared queries (also known as prepared statements). I had no idea they would improve the execution speed of certain queries by nearly 10x! On the back-end of things, that's a considerable improvement. On some pages it cut the server's processing time per page roughly in half -- from 65ms to 35ms! On the front-end, the site will probably feel a tiny bit snappier. Overall your average page load will drop from about 160ms to about 130ms (since it takes about 100ms for data to travel over the internet from your computer to the server and back). That may not seem like much on your end (a drop in latency of just under 20%) but on the server side it's quite dramatic (a drop of nearly 50%).
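If you have never seen a prepared query, here is a small sketch of the idea using Python's built-in `sqlite3` module. This is not the site's real database or schema: the `forms` table, its columns, and the `lookup` function are all made up for illustration. The point is that the SQL text stays fixed and only the parameter changes, so the database can compile the query once and reuse it.

```python
import sqlite3

# Hypothetical in-memory table standing in for a morphology database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forms (form TEXT PRIMARY KEY, lemma TEXT)")
conn.executemany(
    "INSERT INTO forms VALUES (?, ?)",
    [("iecit", "iacio"), ("vivvs", "vivus")],
)

# One fixed SQL string with a placeholder. Because the text never changes,
# sqlite3 can reuse the compiled statement from its statement cache
# instead of parsing the query again on every call.
LOOKUP_SQL = "SELECT lemma FROM forms WHERE form = ?"


def lookup(form: str):
    """Return the lemma stored for a form, or None if it is not in the table."""
    row = conn.execute(LOOKUP_SQL, (form,)).fetchone()
    return row[0] if row else None


print(lookup("iecit"))
print(lookup("nihil"))
```

How much speed this buys depends entirely on the database and the query, so the 10x figure above is what I measured for my own queries, not something this toy example will reproduce. As a bonus, passing values as parameters also guards against SQL injection.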
Labels: ajax, database, development, prepared query, web server
Wednesday, August 6, 2008
Prettying up the Joint
So I haven't had as much free time as I wanted this week. I've been busy at my day job getting everything "taken care of" before my last day there on August 13th. After that, I'll just be contracting with them for 5 hours a week, because I'll be a full-time teaching and grading assistant for the University of New Mexico Foreign Languages and Literatures Department. I'm pretty excited about that!
Nevertheless, I had a few hours free tonight, so I did some sprucing up. I made some icons, fixed some style sheets and squashed some small bugs. There are a few things I want to include before the semester starts:
- Add a few pronoun paradigms: is and iste for sure.
- Add some verb paradigms: perhaps volo verbs.
- Fix up the database backend, especially in the realm of update cascades (it's technical, and you're probably wondering what that means -- don't worry, it'll make things better).
- Speed up the morphology lookup. It's not slow by any means, coming in at approximately 100 milliseconds per word. But still, I think I can get it down to 40ms. Every bit helps, especially if this site ever gets popular!
- Set up a new database and web server. Right now the site is being graciously hosted at the place I work here on campus (Natural Heritage New Mexico). I've been the system admin there for about 5 years, but now -- as I wrote earlier in this post -- I won't be there for very much longer.
Labels: bugs, database, icons, NHNM, paradigms, pronouns, style sheets, UNM, verbs, web server
