Wednesday, October 14, 2009
J's and U's Updated / Speed Increases
I mentioned a few weeks ago that I planned on making I's/J's and U's/V's look the same on the back-end, while preserving their traditional orthographies on the front-end. I've just completed this task!
My main motivation for this update is that certain passages stored in The Latin Library follow the older conventions of using J's for consonantal I's, or U's for both consonantal and vocalic V's. Numen's parsing engine was having trouble recognizing forms like jecit (iecit) and uiuus (vivus). After a bit of work, the engine is updated and now recognizes more possibilities than ever. Incidentally, J's are stored internally as I's and U's as V's.
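A minimal sketch of this kind of letter folding, in Python, might look like the following. It is an illustration only, not Numen's actual code: the names normalize and _FOLD are hypothetical, and it assumes a plain character-for-character swap applied to both the stored form and the form being looked up.

```python
# Hypothetical translation table: fold J into I and U into V, keeping case.
_FOLD = str.maketrans({"j": "i", "J": "I", "u": "v", "U": "V"})


def normalize(word: str) -> str:
    """Return the back-end spelling of a word (J stored as I, U stored as V)."""
    return word.translate(_FOLD)


def same_word(a: str, b: str) -> bool:
    """Two spellings match if they fold to the same back-end form."""
    return normalize(a) == normalize(b)
```

With this, same_word("jecit", "iecit") and same_word("uiuus", "vivus") both come out true, while the spelling the reader sees on the front-end is never touched.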
Another project I completed at the same time is an order-of-magnitude speed improvement for parsing. I was looking for ways to make the engine faster and discovered a shortcut that boosts speed tremendously. The engine used to spend between 250ms and 500ms parsing each word! That was always disappointing to me, but I had gotten around the problem by caching the results. Now, however, word parsing takes about 25ms!
Why bother improving the speed? Because soon I will be implementing word lists and frequency lists! A word list, of course, is just a "mini-lexicon" that defines only the words in your chosen passage, and a frequency list is a list of words ordered by how often they appear in a passage. The word list will help you work quickly through the vocabulary for a passage, and a frequency list will help Latin students study more effectively by giving them the most frequent words first. I'm very excited about this feature, but I don't anticipate it will be done before January 10th (giving me the winter holiday to work on it).
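To make the two ideas concrete, here is a rough Python sketch. It is not the planned implementation: tokenize, frequency_list, word_list, and the lexicon dictionary are all hypothetical names, and the sketch counts surface forms exactly as written, whereas a real version would presumably run each word through the parsing engine first so that inflected forms are grouped under one headword.

```python
import re
from collections import Counter


def tokenize(passage):
    """Split a passage into lowercase words (ASCII letters only, for simplicity)."""
    return re.findall(r"[a-z]+", passage.lower())


def frequency_list(passage):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokenize(passage)).most_common()


def word_list(passage, lexicon):
    """Return a mini-lexicon covering only the words that occur in the passage."""
    return {w: lexicon[w] for w in sorted(set(tokenize(passage))) if w in lexicon}
```

Given the made-up passage "puella rosam amat et puella rosam dat", frequency_list puts puella and rosam (two occurrences each) ahead of the words that occur once, and word_list returns definitions only for those words that the supplied lexicon actually contains.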
That's all for now!
Labels: accuracy, database, development, features, frequency lists, google cache, orthography, parsing engine, performance, slowness, vergil, word lists
Wednesday, December 24, 2008
New Server and Speed Increases
It's finally here and up and running!
I bought a new server. Did you know you can get slightly older, but still really powerful, computers for super cheap? People and businesses upgrade and then basically give their old computers to discounters for nothing! I got this server for $144, including tax, shipping and an extra year's warranty. I'm very impressed.
Also, UNM (University of New Mexico) gave me a static IP address on their network, so we have a super-fast internet connection.
So, if you're used to this site being slow, get ready for serious changes! In general, moving to this new server on this new internet connection has increased the speed by an order of magnitude (from 300ms per request to 15ms per request). Wow!
But that's not all, folks! I've also done some back-end coding to cache the results of morphology lookups. So now morphology lookups should speed up by another order of magnitude (as long as a word is cached). If the word is not cached, the lookup will still be 2-3 times faster.
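For anyone wondering what caching a lookup means in practice, here is a small Python sketch of the general idea. It is not the site's code: parse_morphology is a hypothetical stand-in for the slow lookup, CALLS exists only to show how often the slow path runs, and the cache lives in memory via functools.lru_cache, whereas the real cache may well be stored somewhere else entirely.

```python
from functools import lru_cache

# Counts how many times the slow path actually runs (for demonstration only).
CALLS = {"count": 0}


def parse_morphology(word):
    """Hypothetical stand-in for the expensive morphology lookup."""
    CALLS["count"] += 1
    return (word, "analysis placeholder")


@lru_cache(maxsize=None)
def cached_parse(word):
    """Do the slow lookup the first time a word is seen; reuse the result after that."""
    return parse_morphology(word)
```

Calling cached_parse("amat") twice runs the slow lookup only once; the second call is answered straight from the cache, which is where the extra speed for cached words comes from.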
I apologize for geeking out a bit here, but I hope you notice the speed improvements.
As always, I'm developing The Latin Lexicon, but since I'm on winter break, expect to see some serious improvements in January!
Oh, one more thing. I also set up some bug-tracking software (Bugzilla) to keep track of issues and improvements. So if you find all this technical stuff interesting, feel free to check it out!
Ok, one more thing! OpenID logins will be down for a day or two. Also, if you created an account or any flashcards between the 16th of December and today, I'm afraid that information is lost because I upgraded the database on the 16th and didn't get it moved until today. Sorry about that, if you're affected.
Happy holidays! Happy Hannakwanzaamas!
Labels: bugs, development, features, flashcards, openid, slowness, UNM, web server
Friday, November 7, 2008
Temporary Server Move
The temporary home of The Latin Lexicon is now on a slowish public web server until a grant comes through or a new server arrives. I hope to have one up in the next two weeks.
So, as a result, the site will be 5-10 times slower. That's not to say it's deadly slow, but it will be a noticeable change. The good news is that everything seems to still work, and within a week or two everything will be back to the same speed it was yesterday, perhaps even faster.
Thanks for your patience and understanding.
Labels: grants, slowness, web server
