Idea to remove spam and eliminate the need for registration.

Android Mouse
Member
Posts: 28
Joined: Fri Feb 02, 2007 10:36 pm

Post by Android Mouse »

All that would be needed is to change the HTML names of the form inputs. For example, the default textarea input name when editing is 'wpTextbox1'; change this to anything else and bots wouldn't be able to edit.

I personally don't think requiring registration is in the spirit of a wiki. Some people may find a small error but not want to go through the hassle of registering, and as a result move on and leave the error.

Adding a captcha for those not logged in would also work, but I think just changing the input names would be easier for everyone.

An idea to consider, perhaps.
Alboin
Member
Posts: 1466
Joined: Thu Jan 04, 2007 3:29 pm
Location: Noricum and Pannonia

Post by Alboin »

I agree. I think something like what Wikipedia has would be good, where all I have to do is create a user name and password, and BAM! I'm in. I think having to register for the forum is tiresome when I just want to edit the wiki. Although I have never taken care of anything like this before, so I may not be considering certain factors. :wink:
C8H10N4O2 | #446691 | Trust the nodes.
Combuster
Member
Posts: 9301
Joined: Wed Oct 18, 2006 3:45 am
Libera.chat IRC: [com]buster
Location: On the balcony, where I can actually keep 1½m distance

Post by Combuster »

Chase once said (I can't find the thread, otherwise I'd post a link) that he already changed the form names for phpBB and still gets spam. I doubt that MediaWiki bots wouldn't contain such intelligence somewhere.

To fix a misunderstanding here: captchas in their current form are usually broken.

In total, each defensive measure will filter only a percentage of bots, and ATM the current setup seems to be 100% accurate. In short: "Don't change a winning team". (From my own webmaster experience, even the smallest amount of spam is annoying.)
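
To put rough numbers on the "each measure filters only a percentage" point: if the measures are assumed to act independently (an assumption; real bots probably don't behave that neatly), the fraction of bots that gets past all of them is the product of the individual pass rates. The block rates below are made up for illustration.

```python
from math import prod

def fraction_passing(block_rates):
    """Fraction of bots that slips past every measure, assuming independence."""
    return prod(1.0 - rate for rate in block_rates)

# Hypothetical block rates: renamed fields stop 90%, a captcha stops 50%.
print(fraction_passing([0.9, 0.5]))
```

Under those assumed rates roughly one bot in twenty still gets through, which is the reason for not dropping a measure that currently works.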

Besides, registering at phpBB isn't much different from registering at MediaWiki.

But since I'm not the admin, I'll have to wait for Justice Chase to hand down the final verdict.
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]
Android Mouse
Member
Posts: 28
Joined: Fri Feb 02, 2007 10:36 pm

Post by Android Mouse »

Combuster wrote:
To fix a misunderstanding here: captchas in their current form are usually broken.
True, but my point was that anything non-default as a requirement for editing will totally throw off the bots.

For example, on the editing page it would be easy to add another required input field and ask users to enter the URL of the page they are currently on, e.g.:
"Copy + paste the url in your address bar here: [input box]".

Bots wouldn't be able to do this unless they were custom-written for this specific site, which is unlikely anyway. The added input field would only be needed for those not logged in, of course.
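
A rough sketch of what the server-side check for that extra field might look like, in Python. The function names and the normalization rules are assumptions for illustration only; nothing here is an existing MediaWiki feature.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Reduce a URL to (host, path, query) so trivial differences are ignored."""
    parts = urlsplit(url.strip())
    return (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)

def url_field_ok(submitted, page_url, logged_in):
    """Accept the edit if the user is logged in or pasted the current page URL."""
    if logged_in:
        return True  # the extra field is only required for anonymous editors
    return normalize(submitted) == normalize(page_url)
```

For example, `url_field_ok(" http://Wiki.example.org/Main_Page/ ", "http://wiki.example.org/Main_Page", False)` is accepted, while an empty field from an anonymous editor is rejected. A bot that simply echoes the page URL into every input would still pass, so this only stops bots that were not written with such a field in mind.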
spix
Member
Posts: 128
Joined: Mon Jun 26, 2006 8:41 am
Location: Millicent, South Australia

Post by spix »

Android Mouse wrote:
Bots wouldn't be able to do this unless they were custom-written for this specific site, which is unlikely anyway. The added input field would only be needed for those not logged in, of course.
What makes you so sure? Have you studied these bots, or are you just making assumptions?

I wouldn't automatically assume people who write spam bots are stupid.
SpooK
Member
Posts: 260
Joined: Sun Jun 18, 2006 7:21 pm

Post by SpooK »

... or you could integrate the Wiki into the forum ;)
Android Mouse
Member
Posts: 28
Joined: Fri Feb 02, 2007 10:36 pm

Post by Android Mouse »

spix wrote:
What makes you so sure? Have you studied these bots, or are you just making assumptions?

I wouldn't automatically assume people who write spam bots are stupid.
No, but I doubt bots today will be passing the Turing test anytime soon. :wink:

So it is unlikely they will be capable of comprehending what the added text input field is for, much less be able to fill it out correctly.
Brynet-Inc
Member
Posts: 2426
Joined: Tue Oct 17, 2006 9:29 pm
Libera.chat IRC: brynet
Location: Canada

Post by Brynet-Inc »

Android Mouse wrote: No, but I doubt bots today will be passing the Turing test anytime soon. :wink:

So it is unlikely they will be capable of comprehending what the added text input field is for, much less be able to fill it out correctly.
We are Borg!! We will adapt!! 8)

Hehe, sorry.. :lol:

Anyway, self-modifying code is possible, but for these bots to adapt quickly to random modifications without human interaction does not seem possible yet.

I myself don't really think these people are actually writing bots for osdev alone....

But who knows.. They might be attempting to conquer the world with Viagra advertisements ;)
Twitter: @canadianbryan. Award by smcerm, I stole it. Original was larger.
spix
Member
Posts: 128
Joined: Mon Jun 26, 2006 8:41 am
Location: Millicent, South Australia

Post by spix »

Android Mouse wrote:
So it is unlikely they will be capable of comprehending what the added text input field is for, much less be able to fill it out correctly.
You are assuming that spam bots use the name "wpTextbox1" to identify which textarea to put their spam into.

A bot could deduce which textarea to put spam into the same way a human deduces which textarea to put content into.

More likely, it puts spam into every textarea it finds; some attempts work, some don't. That way the bot works with any website with dynamic content, not just MediaWiki.
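
To illustrate why the field name alone is weak protection, here is a minimal sketch using Python's standard `html.parser`. It lists every textarea on a page whatever it is called; the sample page and the renamed field 'zq81x' are made up for the example.

```python
from html.parser import HTMLParser

class TextareaFinder(HTMLParser):
    """Collect the name attribute of every <textarea> on a page."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "textarea":
            self.names.append(dict(attrs).get("name"))

def textarea_names(html):
    finder = TextareaFinder()
    finder.feed(html)
    return finder.names

# Hypothetical edit page where 'wpTextbox1' has been renamed to 'zq81x'.
page = '<form><input name="summary"><textarea name="zq81x"></textarea></form>'
print(textarea_names(page))
```

A dozen lines of generic parsing are enough to find the renamed field, so a site admin should not count on the rename by itself.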
Alboin
Member
Posts: 1466
Joined: Thu Jan 04, 2007 3:29 pm
Location: Noricum and Pannonia

Post by Alboin »

Apparently you have to be an expert in AI to create a good spambot. Hmm.. I can picture it now: spam bots written in Lisp. :) (Although I have read of people using Lisp as a server-side language...)
C8H10N4O2 | #446691 | Trust the nodes.
ucosty
Member
Posts: 271
Joined: Tue Aug 08, 2006 7:43 am
Location: Sydney, Australia

Post by ucosty »

spix wrote:
So it is unlikely they will be capable of comprehending what the added text input field is for, much less be able to fill it out correctly.
You are assuming that spam bots use the "wpTextbox1" to identify what textarea to put their spam into.

A bot could deduce which textarea to put spam into the same way a human deduces which textarea to put content into.

More likely, it puts spam into every textarea it finds; some attempts work, some don't. That way the bot works with any website with dynamic content, not just MediaWiki.
What about a text box where, if it contains text, you know it's a bot? Label it something like "Don't put text in me, or else".

edit: Scratch that, I have no idea what I was thinking. I had a long day at work. :x
The cake is a lie | rackbits.com
bubach
Member
Posts: 1223
Joined: Sat Oct 23, 2004 11:00 pm
Location: Sweden

Post by bubach »

I have seen many different types, like having a pre-checked checkbox with the text "I'm a dirty spambot, please ignore this post", and also one where all the form names are random each time, with a hidden field that explains how to map them back to the real names.
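
Both tricks could be sketched roughly like this in Python. This is only one possible way to do it: here the hidden field carries a random token, and the server derives the per-form field names from that token with an HMAC instead of spelling out the mapping. `SERVER_SECRET`, `REAL_FIELDS` and the checkbox name `im_a_spambot` are all hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Optional

SERVER_SECRET = b"change-me"      # hypothetical; a real site would load this from config
REAL_FIELDS = ("title", "body")   # hypothetical real field names

def masked_name(real_name: str, token: str) -> str:
    """Derive the per-form name of a field from the hidden token."""
    digest = hmac.new(SERVER_SECRET, f"{token}:{real_name}".encode(), hashlib.sha256)
    return "f" + digest.hexdigest()[:12]

def new_form() -> dict:
    """Create the hidden token and the randomized names for one rendering of the form."""
    token = secrets.token_hex(8)
    return {"token": token, "names": {n: masked_name(n, token) for n in REAL_FIELDS}}

def decode_submission(post: dict) -> Optional[dict]:
    """Return the fields keyed by real name, or None if the post looks automated."""
    if post.get("im_a_spambot") == "on":   # the checkbox humans are asked to untick
        return None
    token = post.get("token", "")
    decoded = {}
    for name in REAL_FIELDS:
        key = masked_name(name, token)
        if key not in post:
            return None                     # posted with the wrong (e.g. default) names
        decoded[name] = post[key]
    return decoded
```

A bot that posts to the default names, or leaves the checkbox ticked, gets `None` back. A bot that fetches the form first and replays it as-is would still get through unless it also trips the checkbox, so this filters only the lazier bots.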
"Simplicity is the ultimate sophistication."
http://bos.asmhackers.net/ - GitHub
Kevin McGuire
Member
Posts: 843
Joined: Tue Nov 09, 2004 12:00 am
Location: United States

spam filtering idea

Post by Kevin McGuire »

What about twenty pictures, each of a major scene, and have the user type what the scene contains? For example, a scene with planets orbiting in the solar system, with certain words associated with the picture, like:

earth
planet
planets
world
universe
Better, allow each word to be slightly misspelled, say up to a certain percentage of the word's length. At the same time, have someone check over the log of failed words to find ones that make sense but that we forgot to add.
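
A minimal sketch of that tolerant matching in Python, using plain Levenshtein edit distance. The 25% tolerance (`MAX_ERROR_RATIO`) and the function names are assumptions picked for the example; the word list is the one above.

```python
ACCEPTED = {"earth", "planet", "planets", "world", "universe"}
MAX_ERROR_RATIO = 0.25  # assumed tolerance: up to a quarter of the word may be wrong

def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def answer_ok(answer, accepted=ACCEPTED, failed_log=None):
    """Accept an answer close enough to any accepted word; log the ones that fail."""
    word = answer.strip().lower()
    for target in accepted:
        if edit_distance(word, target) <= int(len(target) * MAX_ERROR_RATIO):
            return True
    if failed_log is not None:
        failed_log.append(word)  # reviewed later for words we forgot to add
    return False
```

With these settings "planit" is accepted (one error against "planet"), while "kitten" is rejected and ends up in the log for a human to review.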

If the user cannot get it right, let them send a special registration message that gets posted in the forum, where only users who have made a certain number of posts can reply with the word "accepted" to let the registration pass. That gives a fallback for people who might have a hard time with riddles. We should have a user base that is almost constantly online and can reply quickly, instead of relying on just the administrator and moderators.

-Sub Forum: Registration
-The Threads Are Registrations
Once a registration is accepted, the thread is automatically removed so we do not end up with a flooded sub forum.

Even better, have the failed words appear in the registration sub forum thread for that registration attempt, identified by e-mail address, so the users here who have over a certain number of posts can help add valid words for each picture.
~
Member
Posts: 1227
Joined: Tue Mar 06, 2007 11:17 am
Libera.chat IRC: ArcheFire

Post by ~ »

Does that mean they would be able to figure out what the fields are for even if the names change randomly every time, with names like kiarkt or tialit, and there are something like 100 invalid fields?

To make it harder, the fields could be positioned using CSS in such a way that only valid fields are seen by the user. The interface could also be generated using JavaScript, so those programs would need to interpret both JavaScript and CSS correctly, with rather complex and "self-modifying" algorithms, to work out the layout and naming of fields (the server would keep track of the valid ones using a session cookie, so the bot would also need to accept and keep cookies). If invalid fields are filled in, the submitted content would be rejected.
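
The server side of that could look roughly like the following Python sketch. The CSS/JavaScript part that hides the trap fields is left out, and the session is just a dictionary here; `build_form` and `check_submission` are hypothetical names, not part of any existing forum or wiki software.

```python
import secrets

def fresh_names(count):
    """Generate `count` distinct random field names."""
    names = set()
    while len(names) < count:
        names.add("f" + secrets.token_hex(4))
    return list(names)

def build_form(real_fields, decoys=100):
    """Return the state to keep in the session for one rendering of the form."""
    names = fresh_names(len(real_fields) + decoys)
    valid = dict(zip(names[:len(real_fields)], real_fields))  # random name -> real name
    traps = set(names[len(real_fields):])                     # hidden from humans via CSS
    return {"valid": valid, "traps": traps}

def check_submission(session, post):
    """Return the real fields, or None when any trap field was filled in."""
    if any(post.get(name, "").strip() for name in session["traps"]):
        return None
    return {real: post.get(name, "") for name, real in session["valid"].items()}
```

A bot that fills in every field it finds hits a trap and is rejected, while a human who only sees the valid fields passes. Users of screen readers or browsers with CSS turned off would see the trap fields too, so they would need a visible "leave empty" label.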

----------------------------------------
Another thing that I have suggested is to look for anomalous web addresses and anomalous or unrelated words in the profiles, and to check the posts of such users. If they are found to be fake, they could be deleted automatically (user and posts) without human intervention, but that would require a web crawler going through Google and Yahoo search to look for such messages, which would be around 70% of the time. New users could be prevented from posting in General Ramblings until they have made, say, 15 useful, well-crafted and coherent posts, or otherwise be fooled into thinking that they had succeeded in sending garbage.

Of course all that would require a considerable amount of programming hours...
Candy
Member
Posts: 3882
Joined: Tue Oct 17, 2006 11:33 pm
Location: Eindhoven

Post by Candy »

Most of the "check if human" checks proposed are things computers can do, if only because they use a certain determination humans aren't that good at either. Which of these three is a planet? The one with one object on a black background, of course. Which is earth? Check the colors. Statistics could solve that.

You need to do something humans are good at, something that's kind of subjective. The best theoretical test I've seen for the purpose is the "kitty" test, in which you get 9 images, 3 of which are kittens and the other 6 of which are something else that looks a bit like one but is clearly not a kitten (dogs, birds, mangled images of something that looks like a cat). Computers have no definition for "kitten", especially since it isn't a clear-cut decision.
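
The bookkeeping for such a test is simple; the hard part is the image collection. A minimal Python sketch, assuming two hand-sorted lists of image file names (the names and functions are hypothetical):

```python
import random
from math import comb

def make_challenge(kittens, others, rng=random):
    """Pick 3 kitten images and 6 look-alikes, shuffled into a 9-tile grid."""
    grid = rng.sample(kittens, 3) + rng.sample(others, 6)
    rng.shuffle(grid)
    kitten_set = set(kittens)
    answer = {i for i, img in enumerate(grid) if img in kitten_set}
    return grid, answer

def passes(selected, answer):
    """The user passes only by selecting exactly the kitten tiles."""
    return set(selected) == answer

# A bot picking three tiles blindly succeeds once in comb(9, 3) = 84 tries.
GUESS_ODDS = 1 / comb(9, 3)
```

One chance in 84 per attempt is not much protection by itself, so a real deployment would also want to limit retries per address or chain two challenges.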

A moment of thought led me to come up with other tests, depending on the audience. This audience isn't that good a match, but for instance, for a car audience you could take cars with their badges removed and ask for the brand of the car.



Most important, however, is that none of these measures be widely adopted (!). If something is widely adopted, it becomes worth a human's effort to write a bot for it in order to make a profit. On the other hand, you can make something computationally infeasible by picking something a human is really, really fast at and a computer is really, really bad at. That's not OCR or such.