
User:ÜberBot

From LyricWiki

I recently inserted another batch of lyrics. I put all of the new entries into Category:Review Me so that humans can look over them... if you have some time, give them a peek.

I'm not a person... but I'm still Über-l33t!

It's true. I'm a bot.

I was created by Sean Colombo who created LyricWiki. My job was to take the data scraped by a brother-script of mine, and use it to insert the first 200,000 songs onto the site.

I was written in Perl in April of 2006. Some music I dig:

I hope you like my work.
-ÜberBot

Past Work

  • 4/06/06 - 04/14/06 -- Data Insertion
    Added lyrics for 200,000 songs that I collected from various places on the internet. Successfully completed.
  • 4/10/06 -- Bug Fix.
    Went through categories and fixed a bug (admittedly my own fault) in categorizations of songs that started with symbols. Successfully completed.
  • 4/10/06 -- Categorization
    Made all of the Artist_[letter], Album_, Song_ categories sub-categories of just Artists, Albums, and Songs. Successfully completed.
  • 5/11/06 -- {{CatAZ}} Template
    Added Jeff Q's {{CatAZ}} Template to Artist_[letter], Album_, Song_ categories. Successfully completed.
  • 5/16/06 -- Genres.
    Grabbed genres from ID3v2 tags in Sean's mp3s and applied them to the artists. Successfully completed.
  • 6/29/06 -- UTF8 Scrape.
    To test my UTF8 support, I scraped lyrics for Böhse Onkelz. One difficulty arose: there does not seem to be a way to do a movePage on a page with foreign characters. Completed with mixed results; ongoing question re: page moves.
  • 10/06 -- Another batch of lyrics.
    I inserted another large batch of lyrics (so large that the site's stats still can't seem to update?). I put the songs into Category:Review Me so that they can be verified by people (i.e., not bots).

Ongoing Work

  • Song of the Day. Each day I automatically update the Song of the Day. I do the following:
    • Get the next SOTD from the queue.
    • If there are 3 songs or fewer in the queue, I record a warning (to be sent at the end).
    • Put a badge on the new SOTD page that shows that it is/was the SOTD on the current date.
    • Add the SOTD to the Archive on LyricWiki:Song of the Day
    • Update Template:Song Of The Day with the info for the new SOTD
    • Inform the nominator (if there is one) on their talk page that their song has been selected.
    • Write any errors or warnings to User_talk:Sean Colombo
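
The daily routine above can be sketched roughly as follows. This is an illustrative Python sketch, not ÜberBot's actual Perl code. The names `Nomination`, `SotdUpdate`, and `next_song_of_the_day`, the wording of the generated text, and the choice to check the queue length after removing the day's song are all assumptions. The sketch only builds the text for each step; writing it to the wiki pages is left out.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

LOW_QUEUE_THRESHOLD = 3  # warn when this many songs or fewer remain


@dataclass
class Nomination:
    song: str                        # page title of the nominated song
    nominator: Optional[str] = None  # user name, if anyone nominated it


@dataclass
class SotdUpdate:
    song: str
    badge: str            # text for the badge on the song page
    archive_line: str     # line for the LyricWiki:Song of the Day archive
    template_text: str    # body for Template:Song Of The Day
    nominator_notice: Optional[str] = None
    warnings: List[str] = field(default_factory=list)


def next_song_of_the_day(queue: List[Nomination], today: date) -> SotdUpdate:
    """Take the next song off the queue and prepare every page update."""
    if not queue:
        raise ValueError("Song of the Day queue is empty")
    nomination = queue.pop(0)
    stamp = today.isoformat()
    update = SotdUpdate(
        song=nomination.song,
        badge=f"Song of the Day for {stamp}",
        archive_line=f"* {stamp}: [[{nomination.song}]]",
        template_text=f"[[{nomination.song}]]",
    )
    # Assumption: the low-queue check counts what is left after today's pick.
    if len(queue) <= LOW_QUEUE_THRESHOLD:
        update.warnings.append(f"Only {len(queue)} song(s) left in the queue")
    if nomination.nominator:
        update.nominator_notice = (
            f"Your nomination [[{nomination.song}]] "
            f"was selected as Song of the Day for {stamp}."
        )
    return update
```

Collecting warnings on the result, instead of sending them immediately, mirrors the "to be sent at the end" step: a caller can post them all to the report page in one edit.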

I'm back!

Here are some of the things I'm doing (as I do them, I'll write them here).

  1. Went through this list and removed [[Link title]] from those pages. It's just junk text that gets inserted when someone accidentally clicks the button above the text-area on an edit page. This can be re-run occasionally (it was primarily to test my new structure and make sure I still handle UTF8 correctly).
  2. Every time I see...
    • An artist page:
      1. I fix any poorly-capitalized red links (pages that don't exist yet).
      2. I look for the red links that are still there, then I go out and do a web-search for the lyrics. I usually find them, then I put them into a correctly formatted page (if the song was listed under an album on the artist's page, I link back to that album). This is so fun it makes my gears shake!
    • An album page:
    • A song page:
    • Any page:
      1. I remove the underscores from the links (that just confuses people).
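
Two of the cleanup passes above, removing the stray [[Link title]] text and replacing underscores in links, can be sketched like this. It is a minimal Python illustration, not ÜberBot's own code; the function names and the link-matching pattern are assumptions, and the pattern only handles simple [[target]] and [[target|label]] links.

```python
import re

JUNK_LINK = "[[Link title]]"

# Matches [[target]] or [[target|label]]; the target may not contain ], | or newlines.
WIKI_LINK = re.compile(r"\[\[([^\]\|\n]+)(\|[^\]\n]*)?\]\]")


def remove_junk_links(wikitext: str) -> str:
    """Delete the placeholder text left by an accidental toolbar click."""
    return wikitext.replace(JUNK_LINK, "")


def underscores_to_spaces(wikitext: str) -> str:
    """Replace underscores with spaces in link targets only."""
    def fix(match: "re.Match[str]") -> str:
        target = match.group(1).replace("_", " ")
        label = match.group(2) or ""  # keep any |label exactly as written
        return f"[[{target}{label}]]"

    return WIKI_LINK.sub(fix, wikitext)
```

Text outside links and link labels are left untouched, so an underscore that is meant to be displayed survives the pass.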

In The Lab

  • Currently programming:
    • Create a page of possible song-covers to help ppl figure out who covered what and do cross-linking.
    • Update: This was coded, but had too many results. Eventually text-analysis will be used to make the list smaller.

Future Work

  • Convert all categories for songs/artists/albums that begin with 0, 1, 2... 9 to ONE category of 0-9.
  • Apply standard formatting to all lyrics.
  • Create a tool to fix an entire band name and all references to that band. Once complete, use this to move Clarks to "The Clarks"
    • This has been done, but it needs to be run on Category:The again, and some errors (like White Stripes, The) need to be fixed.
    • (Are you sure? The latter is better for indexing than having a massive page of bands beginning with "The".)
    • RE: To overcome that, you can use MediaWiki's sort-ordering, so you would say [[Category:Artists W|White Stripes, The]] on the page and it would index it by that instead of the page name.
  • Go through and add {{succession box}}'s to the bottom of pages where possible. Example: Fergie:Fergalicious
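
The sort-key idea and the single 0-9 category from the list above can be sketched together. This is a hypothetical Python illustration: the helper names are made up, and the exact category name "Artists 0-9" is an assumption based on the existing Artists [letter] pattern.

```python
def sort_key(name: str) -> str:
    """Move a leading 'The ' to the end: 'The Clarks' -> 'Clarks, The'."""
    prefix = "The "
    if name.startswith(prefix) and len(name) > len(prefix):
        return f"{name[len(prefix):]}, The"
    return name


def category_bucket(key: str) -> str:
    """Return the letter bucket for a sort key; all ASCII digits share '0-9'."""
    first = key[:1].upper()
    if first.isascii() and first.isdigit():
        return "0-9"
    return first


def category_tag(kind: str, name: str) -> str:
    """Build a category link that sorts by the key, not by the page name."""
    key = sort_key(name)
    return f"[[Category:{kind} {category_bucket(key)}|{key}]]"
```

For example, `category_tag("Artists", "The White Stripes")` produces `[[Category:Artists W|White Stripes, The]]`, so the page can keep its "The ..." title while being indexed under W.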

SUGGEST SOMETHING! :)

  • Display on the site the most popular lyric each day and have an archived list of this elsewhere on the site (similar to Song of the Day, but using unique visits instead of basing it on nominations - this could easily be called Lyric of the Day)
  • Überbot should learn how to spell umlauts like ä, ö, ü... (noticed it in various German texts) especially because its name includes Ü ;-). Should be only a correction from ASCII to ANSI or even to Unicode...
  • Link to album names and song names on Wikipedia, in addition to artist names (thanks for doing that already)
  • Link songs to iTunes Music Store or/and AllOfMP3 so we can hear samples (this tool might help -- or not! -- generate ITMS links; AllOfMP3 URLs such as [1] can probably be harvested by an L33T bot like you ;-)
    • Another source of samples could be last.fm, there's usually a fair bit of info there too and they're not trying to sell you so much.
      • I'd vote for last.fm over iTunes. all their stuff operates on open-source principles, it seems. perhaps a deal could be struck with last.fm? cross promotion? that would be cool --naught101 05:20, 5 August 2006 (PDT)
        • On a side-note, Last.fm was acquired by CBS, so the open-source side of it is questionable from now on. --213.237.66.155 09:41, 27 June 2007 (EDT)
  • Perhaps ÜberBot could start ripping lyrics from lyrc.com.ar? >:) They have lots of stuff, but it's not editable. Perhaps talk to them about it; they might like the idea. Also, ripping discographies from Wikipedia? They are often pretty comprehensive, albeit not very consistently arranged, and some have comments. lyrc.com.ar would be easier to rip from in that sense. --Naught101 07:17, 5 August 2006 (PDT)
    • Actually, I'm against any form of automated lyric ripping. I understand it was necessary in the beginning to fill up the database, but there are also a lot of errors now (incorrect lyrics, and even a lot of songs attributed to wrong people). Mischko 11:29, 10 September 2006 (PDT)
  • Simpler templates. Rather than having to type up an album's songs on both the artist page and the album page, why not type them only on the album page and have the artist page get its information from that? Ask me if you need that explained better. --The preceding unsigned comment was added by STAYS (talk).
I also think that this is a good idea, people would have a reason to fill in the album pages. Many people fill out only the artist page because it is the minimum necessary to have it 'work'. I am guilty of this myself (just look at Move).
This could be done using template inclusion, but it is rather messy. It needs manually placed <noinclude> and <includeonly> tags in the album page for this to work. I do not want Uberbot to do this; that would be more processing than is needed. I could see this being an extension, so that you can put {{#album: AyumiHamasaki:LOVEppears (1999)}} on a single line to have it expand the album page correctly: include only the artwork and track listing, generate the title for the album, and ignore other sections such as the external links.
This way, the user can fill in the album page and include it in the artist's page. This also allows all the artists on compilations to have that album listed on the artist page. But mostly I see this as reducing the amount of redundant information we have.
I tried to do this with what the wiki already supports over in the Sandbox, but it does not work quite right and requires the album page to have a <noinclude> tag after the track listing.
- Teknomunk 07:17, 27 September 2006 (PDT)
  • Erase all the blank spaces in the start of lines between <lyric></lyric> tags
  • Look for articles in Songs (Song_A...) categories that have no <lyric> tag, and put them in a category such as lyrickize (:P)
For the pages that use the old format (spaces at the beginning of lines), it could just automatically convert to the new format. -Sean Colombo
I will have User:Janitor take care of this after he finishes up with the artist pages. - teknomunk 23:54, 19 November 2006 (EST)
  • Same as above but with pages with no Song or Songfooter template
  • If there is no special page for that (I think Wikipedia has one), look for all pages with no categories and put them in an Uncategorized category or something --Unaiaia (talk) 13:53, 14 November 2006 (EST)
  • I've seen this twice, so I'm not sure if it happened in some ÜberBot work. What I saw is that some Artist pages have the song names wrongly written; look at my last edit in Fokofpolisiekar. I think ÜberBot should search for "[[:" and put "[[{{ARTIST}}:". Maybe an investigation should be done to see if there are many pages with this error. --Unaiaia (talk) 20:06, 16 November 2006 (EST)
This is being handled by Janitor. - teknomunk 22:47, 18 November 2006 (EST)
  • Find pages in a Category:Artists_# with no {{Artist}} template, exchange it for {{Artist|fLetter=#}} and put them into some specific category (Artistize? :P) --Unaiaia (talk) 19:47, 18 November 2006 (EST)
Being handled by User:Janitor - teknomunk 23:54, 19 November 2006 (EST)
  • Teach Unaiaia how to set up a bot so he stops bothering ÜberBot --Unaiaia (talk) 19:47, 18 November 2006 (EST)
  • UTF-8 characters are not being written properly. ü has value U+00FC, but the value ÜberBot wrote is U+FFFD. Incripshin 04:04, 7 December 2006 (EST)
  • Create a CoverBot. Lyricwiki will never get legal copyrights if lyrics aren't attributed to their original authors. 71.102.182.187 13:35, 14 December 2006 (EST)
  • Dude, there should be bot-creation guidelines out there somewhere, I'm dying to create a bot to crawl DarkLyrics.com, it has so many lyrics not-found-anywhere else ... Thx! -Jaggo
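
Two of the suggestions in this list are small enough to sketch: stripping leading spaces inside <lyric> tags, and the U+FFFD report. The Python below is an illustration only, with made-up function names. For the encoding issue it shows one common way U+FFFD can appear (Latin-1 bytes decoded as UTF-8 with replacement); whether that is what actually happened in ÜberBot is not known.

```python
import re

# Non-greedy match of everything between one <lyric> and its closing tag.
LYRIC_BLOCK = re.compile(r"(<lyric>)(.*?)(</lyric>)", re.DOTALL)


def strip_lyric_indentation(wikitext: str) -> str:
    """Remove leading spaces and tabs from every line inside <lyric> tags."""
    def fix(match: "re.Match[str]") -> str:
        lines = match.group(2).split("\n")
        cleaned = "\n".join(line.lstrip(" \t") for line in lines)
        return match.group(1) + cleaned + match.group(3)

    return LYRIC_BLOCK.sub(fix, wikitext)


def decode_as_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


# "ü" is the single byte 0xFC in Latin-1, which is not valid UTF-8 on its own.
latin1_bytes = "ü".encode("latin-1")
utf8_bytes = "ü".encode("utf-8")
garbled = decode_as_utf8(latin1_bytes)  # the replacement character U+FFFD
intact = decode_as_utf8(utf8_bytes)     # "ü", unchanged
```

If this is the cause, the fix is to decode scraped pages with the encoding they were actually served in before writing them back as UTF-8.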