Wednesday, July 04, 2007

Beagle and makewhatis consuming resources

You know, I'm really not a fan of having programs doing things on my system without my consent. So I was a bit put off when I heard the fan start on my system the other day when I wasn't even using it for anything. The fan kicking in means that some system resource, mainly the CPU, is being worked hard enough that the machine needs extra cooling.

Looking at the output of "top", I saw this:
top - 11:58:18 up 1:42, 1 user, load average: 1.41, 1.03, 0.49
Tasks: 126 total, 1 running, 125 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 4.8%sy, 13.7%ni, 49.3%id, 30.5%wa, 1.2%hi, 0.3%si, 0.0%st
Mem: 2074372k total, 1131800k used, 942572k free, 164916k buffers
Swap: 2031608k total, 0k used, 2031608k free, 769072k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3846 beaglidx 34 19 81156 26m 9500 S 45 1.3 3:07.64 beagle-build-in
368 root 10 -5 0 0 0 D 2 0.0 0:04.34 kjournald
54 root 10 -5 0 0 0 S 0 0.0 0:00.07 kblockd/1


What the hell is "beagle-build-in" and why is it consuming half my available CPU?! A quick search on Google turned up the answer: Beagle is a new search tool that Fedora has installed since Core 5. I am using Core 6. Beagle (http://beagle-project.org) indexes your system drives. Fedora Core 6 installs it without your consent, and it consumes a good part of your resources for 10-15 minutes.

I understand that new technology like this can be a good thing, especially if you have a lot of files and need to search them frequently, but I want my system to run lean and mean. To me, having a program installed that indexes my system and consumes resources without my knowing about it goes against the idea of Open Source. Is Fedora getting more like Windows every day? Ugh.

So how do you stop this thing from indexing automatically? There is an entry in /etc/cron.daily, beagle-crawl-system, that you can remove:
[root@computer ~]# ls /etc/cron.daily/
000-delay.cron 0logwatch cups mlocate.cron tmpwatch
00webalizer beagle-crawl-system logrotate prelink
0anacron certwatch makewhatis.cron rpm


I also found another spot where you can disable beagle indexing. In the lower-right corner of Firefox, there is a little dog icon.


Click on the dog icon and a little red "X" will appear to indicate that you've disabled beagle's indexing function.

Finally, be aware that there is a user created in /etc/passwd for indexing:
beaglidx:x:58:58:User for Beagle indexing:/var/cache/beagle:/sbin/nologin

What an irritation! I decided to move the beagle-crawl-system program out of the /etc/cron.daily folder and put it in root's home directory in case I wanted to run it in the future. Beagle does have some interesting demos for those interested in seeing its very fast indexing capabilities:
http://nat.org/demos/
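
For anyone who wants to do the same, here is a minimal sketch of moving a daily cron script aside rather than deleting it. The function name disable_daily_job is made up for illustration; it is not part of Fedora or Beagle. The default paths are assumptions based on the locations mentioned above (/etc/cron.daily and root's home directory).

```shell
#!/bin/sh
# Sketch only: park a daily cron script somewhere safe so it stops running
# but can be put back later. disable_daily_job is a hypothetical helper name.
disable_daily_job() {
    job="$1"                          # script name, e.g. beagle-crawl-system
    cron_dir="${2:-/etc/cron.daily}"  # assumed location of the daily jobs
    backup_dir="${3:-/root}"          # assumed place to keep the script

    # Refuse to do anything if the job is not where we expect it.
    if [ ! -f "$cron_dir/$job" ]; then
        echo "no such job: $cron_dir/$job" >&2
        return 1
    fi

    mv "$cron_dir/$job" "$backup_dir/$job"
}

# Example (run as root):
#   disable_daily_job beagle-crawl-system
# To re-enable it later, move the script back:
#   mv /root/beagle-crawl-system /etc/cron.daily/
```

Moving the file instead of removing it keeps the original script around, so nothing has to be reinstalled if the nightly crawl turns out to be useful after all.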

Soon after I stopped that job from running every day, I noticed ANOTHER program kick off and start using my processor. WHAT IS GOING ON, FEDORA??! Now I'm starting to get angry. Again, I start "top" and this is what I see:
top - 12:19:05 up 2:03, 2 users, load average: 0.16, 0.29, 0.45
Tasks: 120 total, 1 running, 119 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.7%us, 0.3%sy, 0.0%ni, 97.5%id, 0.0%wa, 0.2%hi, 0.3%si, 0.0%st
Mem: 2074372k total, 1325996k used, 748376k free, 238596k buffers
Swap: 2031608k total, 0k used, 2031608k free, 869324k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3911 root 15 0 136m 58m 22m S 11 2.9 1:57.56 makewhatis
3444 root 15 0 61720 38m 9928 S 1 1.9 1:05.34 Xorg


Argh! What is this? A quick "man" on "makewhatis" yields this information:
makewhatis reads all the manual pages contained in the given sections of manpath or the preformatted pages contained in the given sections of catpath. For each page, it writes a line in the whatis database; each line consists of the name of the page and a short description, separated by a dash.

Oh..so it is indexing the man pages to give you a quick synopsis of a program's function when you type:
whatis <program name>

Like so..
[root@computer ~]# whatis ffmpeg
ffmpeg (1) - FFmpeg video converter
ffmpeg-devel (rpm) - Header files and static library for the ffmpeg codec library
ffmpeg-libpostproc (rpm) - Video postprocessing library from ffmpeg
ffmpeg (rpm) - Utilities and libraries to record, convert and stream audio and video
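
To make the man page's description a bit more concrete, here is a rough sketch of how one such "name (section) - description" line could be pulled out of an uncompressed man page source file. This is not how the real makewhatis is implemented; it is a simplified illustration that assumes the page has a ".TH NAME SECTION" header and a ".SH NAME" section whose first line uses the usual "name \- description" form. The function name whatis_line is made up.

```shell
#!/bin/sh
# Sketch only: build one whatis-style line from a man page source file.
# whatis_line is a hypothetical helper, not part of the man package.
whatis_line() {
    awk '
        # .TH NAME SECTION ... gives us the section number (quotes stripped)
        $1 == ".TH" { section = $3; gsub(/"/, "", section); next }

        # Track whether we are inside the NAME section
        $1 == ".SH" { in_name = ($2 ~ /^"?NAME"?$/); next }

        # First line under NAME containing the roff dash "\-"
        in_name && match($0, / *\\- */) {
            name = substr($0, 1, RSTART - 1)
            desc = substr($0, RSTART + RLENGTH)
            print name " (" section ") - " desc
            exit
        }
    ' "$1"
}

# Example: whatis_line /path/to/some-page.1
```

The real tool also deals with compressed pages, preformatted cat pages and pages that list several names, none of which this sketch attempts.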


OK. Well, makewhatis did not take long to run and I did end up liking the simple output of whatis. It is case-insensitive too, as this output of a search on OpenEXR shows:
[root@computer ~]# whatis openexr
OpenEXR-devel (rpm) - Headers and libraries for building apps that use OpenEXR
OpenEXR (rpm) - A high dynamic-range (HDR) image file format


I will use this program in the future. As well, the makewhatis index function did not take up too much CPU (less than 30%) and only ran for about five minutes.

I'm starting to cool off now. But land sakes, do I hate when programs run without my knowledge!!

4 comments:

strenter said...

Just cool down a bit about the automatic stuff. Some of it is a good thing, as you found out yourself. Oh, and if you like 'whatis' you might want to try the 'apropos' command. Where 'whatis' looks only at the command name, 'apropos' looks for strings in the description, too.
Another automatic thing that should run regularly is the cleaning up of the temp directory. Quite useful, if you wonder why your HD gets fuller. ;)

Cacasodo said...

Thanks there, Strenter. Apropos is nice!

I still don't like that Microsoft-like behaviour of indexing automatically, though. Disabled by default, that's what I say!
:)
'sodo

Javier Sánchez said...

Hello everyone! I would like to add that automatic stuff is sometimes a good thing, and probably a necessity. But it is never good to have anything running without you knowing. Of course, one never gets to know everything about Linux, and you learn more about your computer as you run into these situations. Until you do, your OS needs those automatic tasks in place, at least until you learn how to optimize things yourself. That said, indexing tasks are always a pain; I don't know many people who like automatic indexing, and it is always a heavy load. I actually check for it every time I do a fresh install, to make sure it won't bog everything down when I need my computer.
Finally, I don't think it is a good idea for a server to have any automatic indexing at all.

Cacasodo said...

Agreed..thanks Javier!
