2013-02-10

The Cyborg and its Children

This post is a followup to Hello hullo hallo hella hulla, the post detailing why I got my bilateral implants. As a bit of a disclaimer: please understand that my results are purely anecdotal. Cochlear implants have generally had a success rate of about two in three, where success means "being useful enough to the wearer in the long term". The success rate drops considerably the longer one has gone without hearing, naturally. I am, in addition, a far outlier on the bell curve of success.

A bit of an aside - shortly after I published that post, my right implant failed. Around the same time that was confirmed, the manufacturer issued a recall on that particular chipset for a higher-than-acceptable rate of failure. However, the re-implantation process was painless (Oxycodone for the win!), my lost wages during the recovery weeks were reimbursed by the manufacturer (Cochlear), and before long I was once again wearing two. The re-implantation surgery was on Halloween 2011.


The Cyborg

It is now a bit over a year and a half since my first implantation, and a year and three months since my re-implantation. As for the results? Far beyond what I had expected.

In mid-November, when both implants were in working order, I called my sister on the phone. For the previous 30 years with my hearing aids, the phone was always a Charlie Brown experience for me: "Wah wah wah wah wah, wahwahwah wah wahwah wah". But with my implants, I was able to carry on a conversation almost flawlessly with my sister. It was incredible. I was similarly able to call and talk to both my father and my girlfriend (now wife) on the phone.

Is my hearing "fixed"? No - not in any way. Talking to my sister on the phone requires preparation: I have to wire both my processors directly to my phone, and my sister has to place herself in absolute silence (she locked herself in her car) and enunciate clearly. While I still can't understand most people, and only clearly understand those I've known for a long time, this is worlds beyond what I was able to do with my hearing aids.

It is also incredibly draining. Hearing people understand speech as a matter of course: they don't need to focus or concentrate, they just absorb speech, surrounding conversation, and so on. For me, understanding anything that's going on takes a heck of a lot of effort. Probably the best analogy is a hearing person trying to follow speech full of phonemes that don't exist in their native language.

But other than speech - the sounds of the world in general are a lot sharper and more noticeable. The sounds I can now hear - and more amazingly, distinguish - are even now a continuing new experience. My wife and I went camping recently, and I stood in the middle of a forest that would normally have been silent to me - I could distinguish the chattering of the creek from the wind rustling the leaves. I could hear cars and motorcycles zooming by on a distant, unseen highway.

So, suffice it to say, I have effectively become a convert in support of cochlear implants - largely for the reasons listed in Hello hullo hallo hella hulla, reinforced by my particular success.


The Children

However, a number of individuals have since asked me where this makes me weigh in on the "Should deaf children be implanted (before they can consent)?" debate.

Understand one thing: I have no truck with those who argue "It fixes deaf people" or its counter, "Deaf people are not disabled." Both are utterly false.

"It fixes deaf people" - The CI is not a panacea. It's not "Presto! You're hearing, now!". The CI is, instead, "With luck, you can hear better than you could otherwise - and would probably be more useful than hearing aids, depending on your causes." There is, at present, no cure for deafness.

"Deaf people are not disabled." - Understanding this argument requires understanding two parts: the capital-D 'Deaf culture' and self identity. For many people, deaf and otherwise, self identity is very strongly tied to independence. It is harder to imagine oneself as independent if they first believe that they are disabled. This has given rise to a culture of deaf people who believe that deafness is not - indeed, cannot be - a disability. To this, I say: The ability to hear is not required to live independently. Similarly, there are other disabilities that are not handicaps to independence - You can live a good life independently with one or two limbs missing, etc.

With those two arguments set aside, hearing is best understood as a tool. For deaf people, that's all hearing can ever really be. (I am not talking here about those who have only mild hearing loss, but about those who are medically deaf.)

Now on to the argument:

There is one thing that is extremely critical to children, each and every one of them: language. Without language, a person is little more able to express themselves than a feral child. For a deaf child of hearing parents, acquiring language is an extremely difficult task.

One observation is quite applicable here: a deaf child will never see or hear a word of speech that is not directed at them. Even with a CI, they cannot hear well enough to pick up language through absorption the way a hearing baby does.

If your child is deaf and you give them a CI without choosing to use sign, you will need to constantly ensure that the CI is in working order, that you, your family, and your community talk directly to your child, and that you monitor their progress in speech and language.

Choosing oralism without a CI is even more difficult than with one - hearing is most definitely an extremely useful aid to lipreading.

If you instead choose to learn and teach them sign language, you will still need to alter your life drastically. You will need to sign everything you say, not just when you want to talk directly to them.

My own preference - Total Communication - is a mix of both, and probably even more work than the two combined: using a new language (sign language) while also teaching a deaf child to lipread and speak. It's not going to be easy, but it's worth it.

Simply put - no matter which choice a parent makes, it's going to be a lot of hard work. The choice of whether to implant a child is going to vary wildly depending on which way of life the parents themselves choose.

And on that choice, uninvolved people - whether Deaf Culture advocates or Cochlear Implant fanatics - have no business doing more than offering suggestions and anecdotes.

2011-09-24

One month of blogging: Redux

As I promised myself I would, I'm checking in: it's been one month since I started blogging.

Have I succeeded in "blogging"? That depends on the definition of "success". For my own: I've enjoyed having a blog, and it does feel nice to get my projects out there, rather than bottled up on my laptop or server where they'd never see the light of day. It's caused me to push at least one project, CarBoundary, to GitHub that I otherwise wouldn't have. I plan to push others to GitHub too, just to make 'em public - there's no reason to keep 'em private!

In addition, part of the driving force behind bzwikipedia is that I'm planning another blog post announcing it. It's amusing, in a way: I made this blog to showcase what I've done, and now it's _driving_ something I'm doing!

Excepting the introductory post and this one, I have 11 posts so far. What I don't have, however, is any constructive criticism: suggestions to talk about this, not that, and so on. I do hope, though, that that will come with readers. Not that I'm blogging for fame - it'd just be nice to have readers with opinions :D.

So for the time being, I'm going to ... Give it another month. Can I keep it up? It's already lasted longer and has more posts (and definitely more content) than my failed attempt at being social-networky with G+!

2011-09-23

Fender-bender Defender!

Hey, it's the 21st century! Why don't we have instant 3d mapping tech available for cheap, yet?

As I mentioned in my post detailing my implementation of xkcd/941, I had a pair of Kinects from an earlier project.

Back when the Kinect first came out, I grabbed a pair - like many other people, I had ideas for it. Then, as usual with new tech, the limitations came to light, some of the possibilities were explored, and hype quieted down.

So, why did I need two?

Well: I've always wanted to be able to see a top-down view of my car's surroundings. One, obviously, would not be enough. I figured: If I could get it working with two, I could slap a bunch around my car and get it all working together.

So I got started. Here's what I wanted to make:
The black box is your car, gray is safe driving area.
Green boxes are other cars, light green is curb level,
blue dots are trees, people, bikes or other small, moving obstacles.

I wanted to replace the back-up camera with that. If you can see exactly how far your car is from the curb or another car, you won't need to guesstimate the distance through the fish-eye lens of your back-up camera or the "objects are closer than they appear" margin of error in your mirrors.

So I got started. Step 1 was to get a visual display of the depth sensor:

A view of my kitchen from my desk, with a camera tripod in the middle.

Then I buckled down and got working. Mathy stuff: Converting distance as seen from the camera to a top-down plot:

A view of my living room from my desk. Top down (left) and camera's eye (right).
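
For the curious, the conversion is basically polar-to-Cartesian, done per depth pixel. Here's a minimal sketch in Go - not CarBoundary's actual code - assuming the Kinect's roughly 57-degree horizontal field of view:

    // topdown.go: project one depth sample onto the ground plane.
    package main

    import (
        "fmt"
        "math"
    )

    const (
        width = 640                  // depth image width in pixels
        hfov  = 57.0 * math.Pi / 180 // approximate horizontal FOV, radians
    )

    // toTopDown converts a depth sample (meters) at pixel column px into
    // (x, y) on the ground plane: x is lateral offset, y is forward distance.
    func toTopDown(px int, depth float64) (x, y float64) {
        angle := (float64(px)/width - 0.5) * hfov // angle off the optical axis
        return depth * math.Sin(angle), depth * math.Cos(angle)
    }

    func main() {
        // A sample 3 meters away, near the right edge of the frame.
        x, y := toTopDown(600, 3.0)
        fmt.Printf("lateral %.2fm, forward %.2fm\n", x, y)
    }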


The next step was to take the two Kinects and use 'em to see around each other's blind spots:

Annotated display. Darker blue = taller object


When I moved a camera (and updated its location values appropriately), the system rendered the table as a sort of fuzzy circle: the two Kinects were successfully working together to see around obstacles.
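
The merging step is mostly a coordinate transform: each Kinect's points get rotated by that sensor's heading and shifted by its position into a shared world frame. A sketch of the idea in Go (names and numbers are mine, not CarBoundary's - the real thing also has to accumulate the points into a grid):

    // merge.go: transform sensor-local points into world coordinates.
    package main

    import (
        "fmt"
        "math"
    )

    // Pose is where a sensor sits and which way it faces, in world coords.
    type Pose struct {
        X, Y, Heading float64 // meters, meters, radians
    }

    // toWorld rotates a sensor-local point by the sensor's heading,
    // then translates it by the sensor's position.
    func (p Pose) toWorld(lx, ly float64) (wx, wy float64) {
        s, c := math.Sincos(p.Heading)
        return p.X + lx*c - ly*s, p.Y + lx*s + ly*c
    }

    func main() {
        left := Pose{X: -0.5, Y: 0, Heading: 0.3}
        wx, wy := left.toWorld(1.0, 2.0)
        fmt.Printf("world point: (%.2f, %.2f)\n", wx, wy)
    }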

At this point, I figured it was ready for a road test. I took it out, set it up on my car, plugged everything in, and ...

The maximum range fizzled to 2 meters: The sun, on an overcast day, overpowered anything beyond that. Other than that, it worked surprisingly well. And at nighttime, it showed fairly clearly the nearby cars in the parking lot. They just looked like weirdly distorted rectangles. (Note to self: Go out and take a video in the parking lot, just for the blog.)

So, back to the drawing board. And this time, starting from scratch with the specifications and giving a lot more thought to the capabilities of the Kinect.

* Generating the pictures above chewed up a huge percentage of my CPU. Sure, there was a lot of optimization I could do, but enough to reduce the CPU usage to the point where it'd be feasible on a _low power_ chip designed for use in a car? Probably not. And not cheaply.
* Aside from the measurements, I'd need a geometric engine on top of it all to fit basic shapes to the different objects, so I could generate a display that isn't confusing for the driver.
* To get any sort of real coverage, I'd have to cover the car in Kinects. I was originally thinking eight - two at each corner - assuming the technology could handle a 90 degree horizontal view angle and, "as much as possible", an 80 degree vertical one. But the Kinect isn't time-of-flight, as I had originally assumed: it projects an infrared dot pattern and uses distortions in that pattern to triangulate depth (see the sketch after this list). I'm guessing the widest angle the technology behind the Kinect could handle would be 45 degrees, and that's pushing it.
* The sun kills almost all hope for the IR tech in the daytime.
* libfreenect requires that each Kinect be on a _separate_ USB _controller_. My laptop has two. Using a hub of Kinects would only confuse libfreenect.
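
For a rough idea of why wide angles are hard: the IR projector and IR camera form a stereo pair, so depth falls out of triangulation as depth = focal length * baseline / disparity. A back-of-the-envelope sketch in Go, with approximate Kinect-ish numbers (both constants are my assumptions); a wider lens means a shorter focal length in pixels, which cuts depth resolution fast:

    // disparity.go: back-of-the-envelope structured-light depth.
    package main

    import "fmt"

    func main() {
        const (
            focalPx  = 580.0 // IR camera focal length in pixels (approx.)
            baseline = 0.075 // projector-to-camera distance in meters (approx.)
        )
        for _, disp := range []float64{40, 20, 10} { // disparity in pixels
            fmt.Printf("disparity %4.0f px -> depth %.2f m\n",
                disp, focalPx*baseline/disp)
        }
    }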

Put simply, it'd take a lot of funding, research, hardware, and a different form of mass distance measuring tech to get something I could turn into a viable product. And somebody with business sense to turn it into something other than a cool lab product or blog post :D.

If you want to give my CarBoundary a shot, help yourself! As usual, the project is up on GitHub: https://github.com/captdeaf/CarBoundary

2011-09-15

Bananananana

As I've said before, I spend a lot of time on a PennMUSH server named M*U*S*H. It's been almost purely social for the past 12 years: a core of maybe 20-ish players who are regularly active, and up to 100 or so more who pop on and off occasionally. And I can count pretty much all of them among my friends.

There are software geeks, lawyers, students, college professors, schoolteachers, mechanics, electricians, engineers, and more from all across the globe: Australia, the Netherlands, Japan, Norway, Canada, France, Croatia, England, and more. If there is one "flaw" in the group's demographics, it's admittedly STEM (Science, Technology, Engineering and Mathematics) heavy - but those people frequently help each other learn new programming languages. You name it, somebody on M*U*S*H has most likely done it: OCaml, Clojure, Scheme, Haskell, Erlang, all the .NET languages, and more.

Earlier this year, there was some discussion of MU* clients: I've done a lot of work developing the server side of PennMUSH, but we speculated quite a bit about the future of the client side. One of the M*U*S*Hers, known as Talvo, maintains a Tcl client named Potato. A few of us decided to try our hand at making a client that might appeal to a wider audience. Following the naming scheme favored by Potato, we named ours after other foods.

My 'entry' was Banana. Strictly speaking, though, Banana is the server side; it has several front-ends, each with its own name.

Banana went through multiple evolutions:

I first wrote it in Ruby, and got it going fairly quickly. It was good, but its threading setup seemed a little slow: green threads aren't always the fastest to react when signals are going back and forth with sockets in the equation.

So I rewrote it in JavaScript, on Hairy Rhino. It worked well, and it worked fast. But it ran into problems when left running for a while: there was so much string manipulation going on on the JavaScript side, and the Rhino engine handled it so poorly, that it started hitting OOMs despite all the checking and double-checking I could enforce. Forcing GC calls worked, but slowed it _way_ down. Not acceptable. (I've since looked into how Rhino does it: it was creating thousands of copies of each string during a regexp replace!)

So, back to my home sweet home: C. Sure, I could've fixed the JS version by rewriting that little bit in Java, but where's the fun in that? Besides, it's been a long time since I got to write something in C, so C it was.

Writing it in C also forced me to refine the API down to exactly what's needed: the JS (using jQuery) on the client side would do the rest.

So, I did: http://client.pennmush.org/API.txt
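
For a flavor of what a gateway like this does, here's a toy sketch in Go of the buffer-and-poll pattern the jQuery side relies on. To be clear: this is not Banana's API (that's what the link above documents), and Banana itself is written in C - every route and name below is hypothetical:

    // gateway.go: a toy buffer-and-poll gateway. NOT Banana's code or API.
    package main

    import (
        "encoding/json"
        "net/http"
        "sync"
    )

    var (
        mu    sync.Mutex
        lines []string // output from the MU* connection, buffered per poll
    )

    // handleUpdates hands the browser whatever output has accumulated
    // since its last poll; the JavaScript side just re-requests in a loop.
    func handleUpdates(w http.ResponseWriter, r *http.Request) {
        mu.Lock()
        out := lines
        lines = nil
        mu.Unlock()
        json.NewEncoder(w).Encode(out)
    }

    func main() {
        // In a real gateway, a goroutine per world would read from the
        // MU* socket and append to this buffer.
        mu.Lock()
        lines = append(lines, "Connected to M*U*S*H.")
        mu.Unlock()

        http.HandleFunc("/updates", handleUpdates)
        http.ListenAndServe(":8080", nil)
    }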

And here's the client (you can use it to connect to M*U*S*H, or you can request an account from me on M*U*S*H to connect to _any_ mush using it):

http://client.pennmush.org/

For icing on the cake, and thanks to the awesome folks at jQuery, it runs on more platforms than any other MU* client that I know of:
Nintendo 3DS (Photo credit: Chaz@M*U*S*H)

Kindle.

Anything with a relatively modern browser supporting just enough JavaScript for jQuery's AJAX features can use Banana.

Feature list:

* 'Guest' accounts that are limited to one host:port, so mushes can create personalized "Come give us a try!" pages.
* Logging: All output from worlds is logged (not input!)
* Multiple worlds for logged-in user accounts: Connect to not just one MU*, but all the ones you frequent!
* Charset negotiation: UTF-8, latin-1, etc.
* 16- and 256-color support
* Multiple logins: If you log into your account multiple times from different devices at the same time, they all see the same connections!
* Multiple front-ends: Once you're connected (either as a user or as a guest), you can use different front-ends: the primary WebFugue, the fullscreening 'KindleClient', the "for development purposes only" WebCat, and two proof-of-concept front-ends: DutchMush (translates all incoming text to Dutch) and the super-wacky EngrishMush (incoming text translated to a foreign language, then back to English).

It's not yet intended for serious mudding: while there's definitely potential for that (there's an API for saving and loading files for use with 'serious' clients!), there is, as of now, no support for triggers, macros, etc. The speed is there. The power is there (and in jQuery). People can use it to make Facebook client apps. I have somebody using it regularly on her Kindle while she rides to work. Another uses his 3DS to mush while his brother uses the computer to play games. (Isn't that backwards?)

As I'm not really interested in doing visual and client design, I'm about finished with what I want to do with Banana; for now, it's sitting on the back burner. But if anyone wants to work on a front-end, run your own back-end, or whatnot, you can get it, as usual, on GitHub: https://github.com/captdeaf/banana-muclient

(And if you're a mudder and want an account, contact me on M*U*S*H: mush.pennmush.org 4201, player Walker. Or just connect as a guest on client.pennmush.org =).

2011-09-07

Hello hullo hallo hella hulla

One of my earliest, strongest memories is the day I got my first set of hearing aids. I don't remember the testing. I can't recall the fitting. What I do remember is the zoo and the long drive home, with my endless "Hello"s.

I was born and raised in Flagstaff, which at the time had very few services for deaf people. So my parents drove to Phoenix, some two hours away. I got my hearing aids, and the first thing I did after we left the office was begin constantly repeating "Hello" in many different ways: "Hello. Hullooo. Hallo. HELLO. hello. Hulloooohhhelloellooo."

After leaving the office, my parents took my brother and me to the Phoenix zoo. They wanted to know if I could hear the various sounds the animals were making. Unfortunately, all I was doing was making more "hello, hello, hello" sounds. That lasted all through the zoo and on the way home. I was _fascinated_ with this new sensation.

While I'm sure they tired quickly of my hellos, they did later tell me that they were excited at my excitement. Thus assisted, my hearing, while nowhere near "normal", was "acceptable." I could understand my family, I could understand my teachers and peers, and I had no problem being called "weird ears."

That was 1982. At the time, my parents were told about the very early stages of cochlear implant development. Fortunately (in my opinion), they decided that the cons at the time outweighed the pros.

Almost 30 years later, in 2010, nearly all of the cons involved in implants had been obsoleted by advances in technology:
  • Considerably smaller sizes of implants and surgical tools allowed for less invasive surgery, and allowed the skull to remain nice and thick. No longer would a basketball bouncing off the implant point risk brain damage.
  • Stronger materials allowed for longer lasting, better performing implants.
  • Water diving depth increased from 30 feet (about 1 extra atmosphere of pressure) to more than 100 feet (about 4 atmospheres total).
  • General technical quality improvements: More electrodes, finer control, etc.

Conversely (or inversely?), my hearing went the other way:
  • My word and sentence recognition dropped from 80% as a child to 40% by high school, and 0% by college.
  • My hearing loss went from 50 dB to 80 dB by high school, 95 dB by college, and 115 dB by 2010.
  • The range of my hearing shrank. Even amplified, I couldn't hear high-pitched noises.
  • My left ear developed some weird issues that made everything constantly sound like an echo chamber, giving me headaches, so I stopped using a hearing aid in that ear altogether.

Then, in December of 2010, came the final blow to my use of hearing aids: I began experiencing something known as the Tullio phenomenon, most probably caused by some swelling and scarring in my inner ear. Whenever I used a hearing aid in my right ear, the sheer volume it pumped noises out at (115 dB - louder than a jet engine!) would rattle my balance organ at a very high frequency, making me violently dizzy.

So for me, the choice was between not hearing at all, or a cochlear implant. This triggered a lot of thinking: Did I want to hear? What would make me want to hear? Would I be happier if I went ahead and got an implant, or just went full deaf?

Obviously, I had a lot to think about.

For the past 15 years, I've viewed hearing as primarily a tool. With word and sentence recognition at 0%, that's all it really could be. It helped, a little, with lipreading. It was nice to hear sound effects at movies. Friends and family in the same room could hoot or shout to get my attention. When using the radio in my flight lessons, I could hear whether somebody was speaking or not (not what they were saying, just an "Oh, there's noise"). Hearing the beat in some types of music. Listening to rain and thunder in the silence of the night.

It also added a personal touch to many things: The voice of my girlfriend as she says "I love you", the sound of my niece and nephew saying "Unca Geg!", picking up on the laughter of a circle of friends, and all the little touches of sound that fill a little bit more color into my memories, like giving a child with a coloring book a few more crayons for just that much more detail.

While I can easily imagine living a good and happy life without the tool or crayon that is my sense of hearing, that life appeared to me a shade dimmer - a slightly muted experience.

I'm 31. If my grandparents are any indication, I'll be well into my 90s before I kick the bucket. Is the cost (in both cash and 'cons') of the cochlear implant worth another 60 years?

Given those thoughts, and the lack of anything really bad about today's implants, it was a no-brainer at this point: yes, it's definitely worth it to me. If I later decide otherwise, there's really no harm done beyond a small lump on each side of my skull.

So - I got the implants. The surgery was July 11th. What followed was one week of "Whee, painkillers", then one week of "Waugh, annoying annoying annoying!" (bilateral implants meant I couldn't sleep on either side of my head, which is the only way I can really sleep comfortably! :-/). Then on July 25th, I got my processors connected.

"Hello, hullo, hallo, hella, hulla".

2011-08-30

Real World Minecraft?

I created an implementation of XKCD's "Depth Perception" comic: http://xkcd.com/941/

My equipment:


  • Two Kinects (Because I had them from an earlier project). I just used 'em for their webcam capabilities. They each have a USB cord length of 10 feet, letting me have "eyes" 20 feet apart.
  • Two identical Canon fish-eye lenses.
  • My laptop.


What I ended up with:


A screenshot of what my output was while testing at home:



I took it outside to look at distant things, desiring to experience the "Giant among the Clouds" feeling portrayed by XKCD.

What I got:

  • A headache
  • Crossed eyes
  • Clouds and buildings so pixelated that they looked like they belonged in Minecraft.

I think I'd have more success if I had a pair of identical lenses, two laptops, high-def webcams, and some cardboard to build a viewer with. It might also help if the Kinects' webcams had a narrower field of view and were focused at a distance. Unfortunately, the equipment to make that happen is out of my "just trying something cool" budget. =)

I'm on vacation at the moment, but when I get home, I'm going to try using blue+red 3D projection to give wide depth perception, hopefully with fewer headaches =).

2011-08-27

Zombies are awesome. So is Wikipedia.

Zombie movies are cool. Zombie books are even better. Many a geek fantasizes about what they would do when the zombies come. Memorize safe locations and make foolproof plans. Practice Rule #1: Cardio. Design their own version of a "Lobo". Create their own gated community with zombie tests, and blog about it. Nobody cares that it's scientifically impossible. It's just plain fun to imagine a world that's black and white, where you're fighting a clearly defined enemy without morally gray areas, and having to put your wits to the test.

Then there are people with other apocalypse scenarios they love to play through: Nuclear Armageddon. Waterworld-scale climate change. Political and economic collapse. Civil War. Massive, uncontrollable plague. Meteoric destruction. Worldwide natural disasters. Y2K-style computer bugs or viruses. The Other Political Party Wins. Rapture. Peak Oil. Peak Food. It's all a game of survival: How will you make sure that _you_ come out looking pretty after disaster strikes?

At least, they're all fun to think about as long as you're sitting on your couch, watching TV or talking online with friends. Long-term visits to Zimbabwe, Somalia, the Congo, Argentina (circa 2000-ish), Pakistan, or Afghanistan, though, are definitely not on the Survival Game lovers' travel plans.

Still, there is a lot of sense in preparing. Disaster _can_ strike. Let's look at the past 10 years.

1) Tsunamis taking out a lot of towns and villages, leaving millions stranded and starving? Check.
2) Massive earthquakes, destroying a huge number of homes and leaving millions stranded? Check.
3) Hurricanes taking out levees and drowning towns, with government not responding for over a week? Check.

Even without the daydreams of the Survival Game lovers, disaster _can_ happen. So prepare.

Certainly, there are the basics of survival: food, water, shelter. But given our Google-rotted brains and our dependency on the internet and on authorities who aren't us, there's something even more crucial to us: information.

Post-Apocalypse lovers hoard information by the gigabyte: How to build an off-the-grid home. Medical databases. Water filtration. Farming.

But that's a huge amount of knowledge, and while you can download all the PDFs you want, managing all that information is a hassle. Thankfully, a group of awesome people made an easy way to catalogue and access a huge amount of it: Wikipedia.

Sure, there aren't many details, and you'll have to come up with plans on your own, but there's a lot of information there for the Survival Gamer:


Lots and lots and lots of knowledge, all in an easy to access format. Sure, Wikipedia can often be incorrect or deliberately wrong, but by and large, it's a better jack-of-all-trades knowledge database than many of us have access to.

But unfortunately, many disaster scenarios preclude internet access. Loss of power from downed lines. Loss of internet access from cut cables. Being stuck on the road. Living in your log cabin in your mountain getaway.

Sure, you can print out those gigabytes of data on paper, but why kill a forest? (The book in that image supposedly only has 1/10000th of Wikipedia!)

For those times, you'll want to run Wikipedia on your laptop. Power is a lot more reliable than internet: laptops usually need only around 60 watts, and a DC-to-AC inverter for your car can put out 200 watts. Survival Game lovers like to tout generators - your car already _is_ a generator; it just needs a $30 inverter. Or you might have solar panels, wind power, or more. With some help from Wikipedia, you can even make your own generator.

Fortunately, Wikipedia lets you download a copy of the entire database.

Unfortunately: while it's "only" 7 GB compressed, uncompressed it takes a whopping 31 GB, in a barely usable XML format. And if you want to run any of the variations of the Wikipedia server software, you'll need a database taking up at least another 30 GB, plus a huge amount of RAM, etc., etc.

Luckily, somebody figured out how to run Wikipedia off of the provided, compressed XML file - so you can keep and run Wikipedia in only 7 GB! Unfortunately, it's rather a pain to set up: Perl, Python, PHP, Xapian, and Django.

My thought: Why not just one program to do all of it?

As at the beginning of every project, I set down my goals so I could review and decide on the next step:

1) I want it user-friendly: download the latest pages-articles dump, stick it in a drop directory, run _one_ program, and it'll do everything else. Maybe a config file for handling port binding.

2) I want it as a web service: I have no interest in writing a whole GUI just for it.

3) I want it to run on limited resources: it's for home laptop use, by a single person. It shouldn't chew up the whole machine once it's set up and running.

4) I want fast title searching, such as Xapian provides. But I also want other features: typo fixes, accents, different word order, etc. With 11 million titles, that can be a hassle: grep took 12 seconds to find every instance of the word "free" in the title file. For me, that's the number to beat; a sketch of the brute-force baseline follows this list. (Edit: As of now, a title search takes ~1.2 seconds on my laptop.)
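
Here's that brute-force baseline as a minimal Go sketch. It is not bzwikipedia's actual search (the real one also handles typos and accents), and the cache filename is made up:

    // titlesearch.go: brute-force substring search over the title cache.
    // Illustrative only -- the file name and format are assumptions.
    package main

    import (
        "bufio"
        "fmt"
        "os"
        "strings"
    )

    func main() {
        // One title per line, as produced by a title-cache step.
        f, err := os.Open("data/titlecache.txt")
        if err != nil {
            panic(err)
        }
        defer f.Close()

        var titles []string
        sc := bufio.NewScanner(f)
        for sc.Scan() {
            titles = append(titles, sc.Text())
        }

        // A linear, case-insensitive scan of titles held in memory is
        // already far faster than shelling out to grep on each query.
        query := strings.ToLower("free")
        for _, t := range titles {
            if strings.Contains(strings.ToLower(t), query) {
                fmt.Println(t)
            }
        }
    }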

My first attempt was in Ruby: the script took 45 minutes to generate a list of titles from the segmented .bz2 files, simply using "bzcat" and looking for <title>...</title>. Unfortunately, it then took a very long time to read the title list into memory for fast indexing: 'top' showed it using over 3 gigabytes of RAM to hold just 345 megabytes of title data. I spent some minutes looking over the code and wondering if there was a fault in it, but all it was really doing was reading key:value pairs into a massive hash. And Ruby was choking on it.

Next try: Google Go. I've been looking for an excuse to play more with it since I wrote a little string glob matching library as a "first date with Go" exercise. Go has: package bzip2, which lets it read bz2 files (decompression only, but that's all I need). An extensible, and fast, HTTP server. Actual system concurrency, which Ruby doesn't have. It would take more effort to get it to work on systems other than mine, but it can be compiled for all three major platforms. And, hell, I just want to learn more of it.
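
As a taste of that bzip2 package, here's roughly what the bzcat-and-grep pass from the Ruby attempt looks like in Go - a sketch with illustrative filenames, not bzwikipedia's exact code:

    // titles.go: the Go equivalent of
    // `bzcat pages-articles.xml.bz2 | grep '<title>'`.
    package main

    import (
        "bufio"
        "compress/bzip2"
        "fmt"
        "os"
        "strings"
    )

    func main() {
        f, err := os.Open("pages-articles.xml.bz2")
        if err != nil {
            panic(err)
        }
        defer f.Close()

        sc := bufio.NewScanner(bzip2.NewReader(f))
        // Some article lines are enormous; give the scanner room to breathe.
        sc.Buffer(make([]byte, 0, 1024*1024), 16*1024*1024)

        for sc.Scan() {
            line := strings.TrimSpace(sc.Text())
            if strings.HasPrefix(line, "<title>") && strings.HasSuffix(line, "</title>") {
                fmt.Println(strings.TrimSuffix(strings.TrimPrefix(line, "<title>"), "</title>"))
            }
        }
    }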

So I started over with the process: Make it recognize the latest timestamp (in filename: *YYYYMMDD*.xml.bz2), check if that's the current working one in the cached data/ dir, run bzip2recover, generate the title cache file, etc etc etc.
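
The timestamp step, as a small illustrative sketch (the drop-directory name and filename pattern are my assumptions, not bzwikipedia's exact ones):

    // newest.go: pick the newest dump by the YYYYMMDD stamp in its filename.
    package main

    import (
        "fmt"
        "path/filepath"
        "regexp"
        "sort"
    )

    var stamp = regexp.MustCompile(`\d{8}`)

    func main() {
        files, _ := filepath.Glob("drop/*.xml.bz2")
        var dated []string
        for _, f := range files {
            if stamp.MatchString(f) {
                dated = append(dated, f)
            }
        }
        // YYYYMMDD stamps sort correctly as plain strings.
        sort.Slice(dated, func(i, j int) bool {
            return stamp.FindString(dated[i]) < stamp.FindString(dated[j])
        })
        if len(dated) > 0 {
            fmt.Println("newest dump:", dated[len(dated)-1])
        }
    }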

Some hours later, I have something to show. It's ugly and basic, and I'll be improving on it for a long while, but if you have Google Go installed and a 7GB pages-articles.xml.bz2 downloaded, you can have Wikipedia on your machine, taking only 7GB of disk space for the articles, ~350 MB for the title cache, and around 1.5GB of RAM when running. (I'm trying to think of a way to reduce that :D) (EDIT: It now takes only 20MB when running, up to around 80MB when searching, though it still peaks around 600MB when building the initial index - a one-time run.)

The code I'm using to convert MediaWiki markup to HTML is from the nifty InstaView.js, created by Wikipedia user Pilaf. It doesn't do everything needed, and I still need to get proper CSS set up, etc., but at least it's not so raw anymore!

Feel free to grab what I've done off of GitHub: https://github.com/captdeaf/bzwikipedia