Wednesday, January 13th, 2010 10:30 pm
There must be a way to have some kind of bot do routine maintenance tasks that don't take any thought at all. Like when you make a new fan profile, and the fan is not consistently wikilinked, and the fan is prolific, manually inserting wikilinks to connect the article properly just sucks. Really sucks. I've inserted the same four brackets in dozens and dozens of pages just checking a few names. It took hours and is a task that requires no thought at all. (I'm not talking about difficult things like spotting variants or such, just wikilinks that match exactly.) You run a search on the title of the article, go to the result list, and check the first article: is the first instance of this string in double brackets? If yes, do nothing; if no, add the brackets. Then go to the next result and do the same, over and over and over again. Couldn't there be some kind of maintenance bot checking this for all newly created articles, and maybe the old ones in batches or something?
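
For what it's worth, the check described above is small enough to sketch in Python. This is only a rough illustration, not an existing Fanlore bot: the function name `link_first_mention` is made up, it works on a page's wikitext as a plain string, and it assumes "match exactly" means a whole-word, case-sensitive match. Fetching and saving pages is left out entirely.

```python
import re

# Captures existing [[...]] links so they can be told apart from plain text.
LINK = re.compile(r"(\[\[.*?\]\])")

def link_first_mention(wikitext, title):
    """Link the first plain mention of `title`, unless a link to it comes first."""
    parts = LINK.split(wikitext)  # odd indices hold existing links
    mention = re.compile(r"(?<!\w)" + re.escape(title) + r"(?!\w)")
    for i, part in enumerate(parts):
        if i % 2 == 1:
            # Existing link: compare its target (the text before any pipe).
            target = part[2:-2].split("|")[0].strip()
            if target == title:
                return wikitext  # first instance is already in brackets
            continue  # a link to some other page; leave it alone
        new_part, count = mention.subn(lambda m: "[[" + m.group(0) + "]]", part, count=1)
        if count:
            parts[i] = new_part
            return "".join(parts)
    return wikitext  # title never appears as plain text

print(link_first_mention("Sue Fangirl wrote a zine. Sue Fangirl also vidded.", "Sue Fangirl"))
```

A real bot would still need someone to review its edits, since an exact string match can hit a different person or thing with the same name.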
Wednesday, January 13th, 2010 09:43 pm (UTC)
There's an extension that does this. Autolinking has its drawbacks, though, so the wiki committee would need to weigh in on this.
Wednesday, January 13th, 2010 10:04 pm (UTC)
I haven't worked with the updated extension yet, but the old one drove.me.insane (was installed on the work wiki). Thing is, there's no way to *remove* linking, so if an article is deleted, you create a broken link. I basically hand-removed thousands of dead links in the work wiki. Yeah, still at it. UGH.

(actually, there's usually no need to delete links with the search method that's installed on Fanlore, you could just create a redirect. I'd have to look up the new search that's in test atm, though, and see how it interacts with that.)

also, it used to link every single instance of the article in one entry. Plus it didn't grasp compound articles -- so instead of [[Star Trek]] it would link [[Star]] [[Trek]] which...omg, SO nonsensical.

okay, let me look the new extension up, maybe it's improved these nightmarish qualities by now.
Wednesday, January 13th, 2010 10:35 pm (UTC)
sure, you just need someone to program that for you. Extensions are just readymade program modules that someone wrote for people who can't build it themselves. I'm pretty sure that a bot provides its own challenges/implications, though.

ETA: my comment below is a lot of technobabble, so to answer the pertinent bits -- yes, the extension does useful linking now, it should recognize and correctly link [[Sue Fangirl]] as well as [[The Sue Fangirl Club]].

What I meant by "non-destructive" is that if I understand "create link through pageview", the link is not altered in the code itself, so if an article disappears, no dead link should result. Which would be cool, but obv. you'd then load up the linking ...mechanism with each pageview. Ouch.
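
To make that distinction concrete, here is a toy Python sketch of linking at view time. It assumes my reading of "create link through pageview" is right, and it is not the extension's actual code. The function name `render_links` and the sample page titles are invented for illustration.

```python
import re

def render_links(stored_text, existing_titles):
    """Return the text as displayed; the stored text itself is never changed."""
    if not existing_titles:
        return stored_text
    # Longest titles first so a long title wins over a shorter one it contains.
    ordered = sorted(existing_titles, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    return pattern.sub(lambda m: "[[" + m.group(0) + "]]", stored_text)

stored = "Sue Fangirl ran the Fangirl Zine."
pages = {"Sue Fangirl", "Fangirl Zine"}

before = render_links(stored, pages)
pages.discard("Fangirl Zine")          # the article gets deleted
after = render_links(stored, pages)    # no dead link, nothing to clean up

print(before)
print(after)
```

The cost is visible in the sketch too: the matching runs again on every call, which is the per-pageview load mentioned above.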

I may be understanding that wrong, though! I'm just a wiki dabbler and don't speak actual code.
Edited 2010-01-13 10:38 pm (UTC)
Wednesday, January 13th, 2010 10:20 pm (UTC)
hm, okay. Here's what the extension does:
text copied from extension page


* Turn ordinary words into page links, if the name matches. to match plurals and variations,
* write lots and lots of redirection pages.
[...]
* On 08/24/2007, a new version was created by [http://www.mediawiki.org/wiki/User:Vjg] which
* gives preference to longer page names. It switches the order of processing, handling the
* article for each page, but only building the page array once. The previous version handled
* the article once, but built the page array on each pass. I haven't fully explored all the
* performance implications. This was for use on a relatively small site.
*
* on 09/10/2008, a new version was created by Krishna Maheshwari
* which doesn't add a pipe & link if the page name matches the text (longest matches first). It also strips all links
* prior to regenerating them, ensuring if changes or other articles are created with longer names
* that match, than they are used. Ie: This Text could be transformed to This [[Text]] than on a
* subsequent edit become [[This Text]]. In addition, it only allows the first occurance of a page
* to be linked instead of each occurance. This extension is used on Hindupedia (www.hindupedia.com)
* New Tags: __NOAUTOLINK__ => this extension doesn't try to generate any links for page
* __NOREGENERATELINKS__ => don't try to regenerate links
* ... => don't generate links on content inside these tags
*
* On 12/25/2008, a new version was created by Hyunsik Kim
* completly rewritten version. Main function refer to the version 2.4
* magic word function has been removed because I don't know what it is.
* reduce regex functions and replace faster one, improve getting title function.
* consider nested tag structure.
* longest match uses first.
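
The two behaviours the changelog keeps coming back to, longest page name first and only the first occurrence linked, can be sketched in a few lines of Python. This is not the extension's code, just an illustration under simplifying assumptions: the function name `autolink` is hypothetical, the input is plain text with no existing links, and matching is exact and whole-word.

```python
import re

def autolink(text, titles):
    """Link the first occurrence of each title, preferring longer titles."""
    if not titles:
        return text
    # Longest first: the regex tries alternatives in order, so "Star Trek"
    # is tried before "Star" at any given position.
    ordered = sorted(titles, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in ordered) + r")(?!\w)"
    )
    seen = set()

    def replace(match):
        title = match.group(1)
        if title in seen:
            return title  # later occurrences stay as plain text
        seen.add(title)
        return "[[" + title + "]]"

    return pattern.sub(replace, text)

print(autolink("Star Trek and Star Trek again", ["Star", "Trek", "Star Trek"]))
```

Because the longer title is tried first, the compound name is linked as one unit instead of word by word.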


I think the crucial bits are a) not stable, b) performance may be affected [especially if I understand Pageview Autolink correctly as non-destructive linking, which would be vastly preferable, but negative for performance] and c) may need to create tons of redirects. :/ (last one is just a pet peeve of mine)

I still have to think through the implications some more and look up the usage at hindupedia.

ETA: okay, sorry for throwing that brick of text at you, I've now highlighted a few crucial details.
Edited (added highlight/ wow, I suck.) 2010-01-13 10:24 pm (UTC)
Thursday, January 14th, 2010 01:14 am (UTC)
*wiki committee member*

I've used Autolinking in a wiki of my own, and oh God, it was painful. I disabled that extension within a couple of weeks and never looked back.

There are bots out there programmed to do some of the tedious tasks, but it would probably be useful to have people knowledgeable in Python before we attempted to do something? Wikipedia bots usually have at least one user each in charge of checking everything the bot does to ensure it's not malfunctioning.
Thursday, January 14th, 2010 10:27 am (UTC)
which version did you use, do you recall? Because the current one *seems* to have done away with most of the head-desky features. (am asking out of self-interest as well, boss-person wants auto-linking bad, and I need to gauge if it's more work to talk them out of it or just install the damn thing. *sighs*)

if it's not feasible, should we collect a wishlist of tasks to automate and put up a call for python-able users?
Thursday, January 14th, 2010 12:10 pm (UTC)
I really don't—it's been a while. And your point about excessive redirects is valid; they're a pet peeve of mine, too.

(Incidentally, I'd kill to be paid to wiki. *envies*)

We'd need python- and wiki-savvy people—do you reckon there are many among our users? *sighs* I've used this bot for a lot of menial tasks, but I don't know how useful it would be at fanlore.
Thursday, January 14th, 2010 01:29 pm (UTC)
That's pretty easy to set up, but there would have to be someone in charge of each bot—just in case. I'll talk to Meri about this.
Thursday, January 14th, 2010 07:43 pm (UTC)
\o/
Wednesday, January 13th, 2010 11:23 pm (UTC)
If the bot worked well that would be awesome, but it sounds like maybe not so much? I don't actually type any brackets a lot of the time.

I search on Fanlore for the text I want to link, go down the return list and Ctrl click to open each page in a new tab. Go to the first tab and...

I open the page to edit, hit Ctrl + F, type the text I want into the search box, and click next. My browser then highlights the text, I click the Internal link button in the toolbar, and the link is made.

After I close that tab, the search bar is still there with my text still in it, so it's click, click, click and on and on. It's actually more of a hassle saving the page, but my browser will autofill the comment box so that's partly automated too.

This obviously won't work as a shortcut method in every single circumstance. Sometimes you have to do some editing anyway, but in a lot of instances it is the fastest way I have found to make multiple wikilinks across multiple pages.

This might not work in every browser. I use Firefox for Ubuntu, but I recall IE highlighting in the same way.

If someone can't comfortably use a mouse this method is likely not any help at all.