Saturday, December 26, 2009
Perfect Syncopation
18204 total word(s)
17369 word(s) found
20 word(s) not found
815 word(s) ignored
0.11% of words not found
4.48% of words ignored
3264 unique word(s)
But what does it mean???
Well, I've just run the word analysis tool on Book 2 of Livy's Ab Urbe Condita. The important thing to note is that out of more than eighteen thousand words, only 20 weren't parsed and found in the dictionary. That's pretty much amazing.
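For anyone who wants to check the arithmetic, here is a small Python sketch that recomputes the percentages from the raw counts above. The function name `summarize` and its return keys are made up for illustration; this is not the analysis tool's actual code.

```python
def summarize(total, not_found, ignored):
    """Derive the 'found' count and the percentages from the raw tallies."""
    found = total - not_found - ignored

    def pct(n):
        # Percentage of the total, rounded to two decimal places.
        return round(100 * n / total, 2)

    return {
        "found": found,
        "not_found_pct": pct(not_found),
        "ignored_pct": pct(ignored),
    }


# Counts taken from the Livy Book 2 run reported above.
report = summarize(total=18204, not_found=20, ignored=815)
print(report)
```

Note that the three categories (found, not found, ignored) add up to the total word count, so "ignored" words are not counted as failures.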
How did this happen? Well, two things had to happen. First, I now ignore capitalized words that aren't located in the dictionary; essentially, I'm ignoring proper names and place names. Second, I programmed Numen to parse syncopated perfect verbs: laudasse (laudavisse), norat (noverat), et cetera.
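To give a feel for what parsing a syncopated perfect involves, here is a rough Python sketch. It is not Numen's actual code. It assumes the simplest possible approach: put the dropped -vi- back before s, or the dropped -ve- back before r, and see whether the result is a known form. `KNOWN_FORMS`, `expand_syncopated`, and `parse` are hypothetical names, and the tiny word set stands in for a real dictionary lookup.

```python
import re

# Hypothetical stand-in for the dictionary of full (unsyncopated) forms.
KNOWN_FORMS = {"laudavisse", "noverat", "laudaverunt", "delevisti"}


def expand_syncopated(word):
    """Return candidate unsyncopated spellings of a possibly syncopated perfect."""
    candidates = []
    # -vi- dropped before s: laudasse -> laudavisse, delesti -> delevisti
    for m in re.finditer(r"(?<=[aeo])(?=s)", word):
        candidates.append(word[:m.start()] + "vi" + word[m.start():])
    # -ve- dropped before r: norat -> noverat, laudarunt -> laudaverunt
    for m in re.finditer(r"(?<=[aeo])(?=r)", word):
        candidates.append(word[:m.start()] + "ve" + word[m.start():])
    return candidates


def parse(word):
    """Return the full form a word corresponds to, or None if nothing matches."""
    if word in KNOWN_FORMS:
        return word
    for candidate in expand_syncopated(word):
        if candidate in KNOWN_FORMS:
            return candidate
    return None


print(parse("laudasse"))  # laudavisse
print(parse("norat"))     # noverat
```

Checking every candidate against the dictionary keeps the expansion from inventing forms: a word like amat produces no candidates at all, and a bogus candidate simply fails the lookup.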
I still have a bit of testing to do to make sure I didn't break anything, but this was one of the few major hurdles that I needed to overcome to get a nearly perfect parsing engine!
Labels: accuracy, features, paradigms, verbs
Friday, March 13, 2009
Verb Paradigms
Most verbs show up just fine, but of course some irregular verbs will show odd glitches. Therefore the data is "beta" but the paradigms should still prove helpful.
Here are some caveats:
- certain irregular verbs will have weird forms, for instance, the participles for esse (which didn't exist until late antiquity).
- deponent verbs will show active forms. Remember that deponent verbs do have active participles, and the imperfect subjunctive is formed from the "reconstructed" active infinitive. I'm trying to imagine a way to "gray-out" the unused active forms, but I haven't decided fully on that yet.
- as a result of deponent verbs having "active" forms, they are now stored in the dictionary in their active forms, although on flashcards they will still show their deponent forms. So for instance sequor will be searchable under sequo.
- unusual forms, such as dic, duc, and fac will show up as dice, duce, and face. I haven't implemented an "irregular forms" system yet, even though I've half mapped it out. UPDATE: It turns out that Plautus was fond of using forms like dice, duce, and face even though they were later rejected by Terence.
- UPDATE: Some forms which are not known to exist (in other words, we don't have a record of them) but can logically be deduced will show up on the paradigm charts. For instance, the rare future active participle of volo, voliturus, shows up and so does its non-extant future active infinitive voliturus esse. Many grammar books will not show these forms simply because we don't have a record of them. Nonetheless, it is logical to assume they existed or would have been known to exist during Roman times (at least in theory).
So, work continues! Enjoy.
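As an illustration of the dic/duc/fac caveat, here is one way an "irregular forms" system could sit on top of regular paradigm generation, sketched in Python. This is purely hypothetical, since the post says no such system is implemented yet. The table `IRREGULAR_FORMS`, the slot name `"imperative.sg"`, and both function names are assumptions made for the example.

```python
# Hypothetical override table: (lemma, paradigm slot) -> attested irregular form.
IRREGULAR_FORMS = {
    ("dico", "imperative.sg"): "dic",
    ("duco", "imperative.sg"): "duc",
    ("facio", "imperative.sg"): "fac",
}


def regular_imperative_sg(present_infinitive):
    """Regular singular imperative: the present active infinitive minus -re."""
    if not present_infinitive.endswith("re"):
        raise ValueError("expected an infinitive ending in -re")
    return present_infinitive[:-2]


def imperative_sg(lemma, present_infinitive):
    """Prefer an override when one exists; otherwise fall back to the rule."""
    override = IRREGULAR_FORMS.get((lemma, "imperative.sg"))
    if override is not None:
        return override
    return regular_imperative_sg(present_infinitive)


print(regular_imperative_sg("dicere"))    # dice (what the charts show today)
print(imperative_sg("dico", "dicere"))    # dic
print(imperative_sg("laudo", "laudare"))  # lauda
```

Keeping the overrides in a separate table leaves the regular rule untouched, so forms like dice, duce, and face can still be generated when they are wanted.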
Labels: active, deponents, irregular, paradigms, participles, passive, verbs
Monday, February 23, 2009
Again with the optimizing! Oy!
First, I did some server-side wizardry and sped up the website by a small margin (maybe 25%) and also stopped a nasty bug that was slowing down pages every once in a while.
I fixed some smallish bugs on the search pages. The user interface should be a tad more useful and friendly there.
While designing a new, awesome feature (verb paradigms!) I uncovered some paradigm errors. Those have been neatly squashed.
Soon, you will see a masterpiece in action! Full paradigms for all parts of speech! Right now I've got the verbs mostly working. I'll unveil this feature when I feel it's good enough for prime-time.
That's all for now! Carry on...
Labels: bugs, development, paradigms, web server
Tuesday, September 30, 2008
UNUS NAUTA
In a recent post, I noted a problem with the adjective neuter. In the neuter gender, my parser was not finding the nominative and vocative forms. I dug into this problem and discovered some interesting facts that I had never really taken to heart.
UNUS NAUTA is not one declension, as many Latin grammars would have us believe. It is actually composed of four different declensions: the normal UNUS NAUTA declension (unus, nullus, ullus, solus, totus), then the alius declension (which is unique because its genitive is alius instead of aliius as we would expect; and it has neuter-singulars in -ud), next the R-type declension (neuter, uter), and finally the ER-type declension (alter). I can see why Wheelock compressed his declensions, but it turns out that he didn't spend enough time pointing out the differences! To be sure, they are minor, but somewhat important if you're writing software to parse out the different forms!
So I created 4 unique declensions to cover the 9 adjectives. Now, the UNUS NAUTA adjectives parse properly!
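Here is a minimal Python sketch of that four-way split, showing only the forms where the groups actually differ (nominative singular) plus the shared genitive in -ius. The table layout, the class labels, and the function names are assumptions for illustration, not the site's real schema.

```python
# Hypothetical table: dictionary form -> (declension class, stem).
ADJECTIVES = {
    "unus": ("unus", "un"),
    "nullus": ("unus", "null"),
    "ullus": ("unus", "ull"),
    "solus": ("unus", "sol"),
    "totus": ("unus", "tot"),
    "alius": ("alius", "ali"),
    "neuter": ("r", "neutr"),   # R-type: stem drops the e
    "uter": ("r", "utr"),
    "alter": ("er", "alter"),   # ER-type: stem keeps the e
}


def nominative_sg(lemma):
    """Return (masculine, feminine, neuter) nominative singular forms."""
    cls, stem = ADJECTIVES[lemma]
    neuter_ending = "ud" if cls == "alius" else "um"
    # The dictionary form is already the masculine nominative singular.
    return (lemma, stem + "a", stem + neuter_ending)


def genitive_sg(lemma):
    """Genitive singular in -ius, the same for all three genders."""
    cls, stem = ADJECTIVES[lemma]
    if cls == "alius":
        # alius contracts the expected *aliius to alius.
        return stem[:-1] + "ius"
    return stem + "ius"


print(nominative_sg("neuter"))  # ('neuter', 'neutra', 'neutrum')
print(nominative_sg("alius"))   # ('alius', 'alia', 'aliud')
print(genitive_sg("alter"))     # alterius
```

The R-type and ER-type groups only differ in whether the stem keeps its e, which is exactly the kind of small detail that breaks a parser if all nine adjectives share one paradigm.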
Update: I fixed indeclinable nouns, too. I had forgotten to add their (non-) paradigm.
Enjoy!
Labels: adjectives, bugs, indeclinable, nouns, paradigms, unus nauta
Monday, August 11, 2008
Paradigm Updates
Labels: defective, demonstrative, paradigms, pronouns, verbs
Wednesday, August 6, 2008
Prettying up the Joint
Nevertheless, I had a few hours free tonight, so I did some sprucing up. I made some icons, fixed some style sheets and squashed some small bugs. There are a few things I want to include before the semester starts:
- Add a few pronoun paradigms: is and iste for sure.
- Add some verb paradigms: perhaps volo verbs.
- Fix up the database backend, especially in the realm of update cascades (it's technical, and you're probably wondering what that means -- don't worry, it'll make things better).
- Speed up the morphology lookup. It's not slow by any means, coming in at approximately 100 milliseconds per word. But still, I think I can get it down to 40ms. Every bit helps, especially if this site ever gets popular!
- Make a new database and web server. Right now it's being graciously hosted at the place I work here on campus (Natural Heritage New Mexico). I've been the system admin there for about 5 years, but now -- as I wrote earlier in this post -- I won't be there for very much longer.
Labels: bugs, database, icons, NHNM, paradigms, pronouns, style sheets, UNM, verbs, web server
Monday, July 28, 2008
Into the Great Wide Open
Another cool feature, one which is in development, is the flashcards feature. Anytime you see a word you want to study, just check the "I want a flashcard" option. Then, you can print out a list of your flashcards on Avery Business Cards! A planned feature is to be able to study your flashcards online, and keep "sets" of flashcards.
One feature which is not ready yet -- but coming soon -- is the paradigm creator. When completed, The Latin Lexicon will create a full paradigm for any word in the dictionary!
Currently under development is The Latin Lexicon for iPhone/iPod touch. If you have one of these fantastic little devices, give it a try!
There's a lot to come, so this application will remain in beta for a while. Even so, I hope you find it useful!
Labels: browse, development, features, flashcards, iphone, ipod touch, paradigms, search
