Setting up a pool

  1. Filters
  2. Quality control
  3. Overlap

Tip. If you are looking for the answer to a specific question, use Ctrl+F to search the page (Cmd+F on macOS).

Filters

Can I select performers from a specific city of residence or is the only option “Region by IP”?

Yes, you can do that. In the filters, select Profile → City. Please note that the profile data is entered by the user when they register in Toloka. We recommend that you use the filters Region by phone number and Region by IP.

Can I use a skill beyond a particular pool or project and apply it to other projects as well?

Yes, of course — you can use the same skill for different projects. But most often, a skill is intended for a specific project. If the performer completes a certain task well, this doesn't mean that they will complete other ones successfully. Another disadvantage is that if you filter by skills that were set long ago, you will artificially limit the number of available performers.

I want to calculate a skill based on performance in multiple projects. Is that possible? If it is, can I use “Aggregation by skill”?

If you mean multiple different projects, you can't do that.

You can merge all the projects into one and use History size in the quality control rules. See examples in the Control tasks post.

You can use Aggregation by skill, but you'll need to list all the possible values, which is probably not the best choice. Perhaps you'll find another method of aggregation helpful.

Why might aggregation of assignments by performer skill be unavailable? The pool doesn't use incremental relabeling. The output field is a variable with a Boolean value.

Perhaps the output fields you want to aggregate don't have valid values in your project. For now, you have to specify the possible values for every type of output field.

I created a project and a pool, but the Next button doesn't work or the preview shows a blank screen.

Toloka lets you know that something is wrong with the project. The blank screen often appears when there are errors in the task interface, including the JavaScript code. The Next button may be disabled if the output specification lacks some field or contains invalid values, or if, for example, you configured validation for a nonexistent field in JavaScript.

Are there any easy ways to assign a certain user a skill in Toloka, even if the user didn't do any tasks (like I can do in the Sandbox)?

In the main Toloka version, you can only assign a skill to users who have completed at least one of your tasks. There is no option to assign a skill to an arbitrary user. To limit the audience of users who will see your project, use filters. For example, specify the city, date of birth, gender, or some other parameters of your target performers.

Can I somehow limit the number of users that can take tasks from the pool at the same time?

Tasks from an open pool are available to every user that matches your pool filters. You can restrict access, for example, by using a skill as a filter.

The performers completed training for the first pool and got the skill. A week later, we cloned the pool, but all the users lost their skill. Which parameter affects skill expiration? Do all the performers need to complete the training again?

The validity period of the training skills is controlled by the Retry after parameter. The skill is deleted after a period specified in days in the Retry after field, if the performer:
  • Has a skill value lower than the one specified in the Level required field.
  • Didn't complete any tasks linked to training during this period.

Your users will need to be trained again.

Why is my project not available in the mobile version of Toloka?

To make your task available in the mobile app, set up the filter: client = mobile Toloka in your pool.

Can I add an arbitrary user as a performer, if their rating is not high enough?

If the user doesn't match your preset filter or rating level, they can't see the task. The only way to give them access is to remove the restricting filter from the pool. You can test the task in the Sandbox by adding the desired user to your trusted list.

Can I set up a task to display it to users with certain demographic and geo parameters? For example, “Moscow only, 30-45 years old”.

You can do that. To select performers for the pool, use filters.

How do I make the task available not only from desktops, but also from mobile devices?

To make your task also available in the mobile app, set up the following filter in your pool: Client = web version or = mobile Toloka.

Can I select specific performers for my tasks because I liked their results in my previous pools?

You can assign a skill to these people based on their performance in the previous pools. Use this skill as a filter in the new pool.

How do I set up a filter so that the pool is available to users who don't have a specific skill (like a “spammer”)?

Specify this skill as a filter, but leave the value field empty (this is equivalent to absence of the skill).

How can I raise the skill value for a user, if they already have the skill?

If the user already has a given skill, you can't add the same skill to them from the task review interface. You can open the user's profile and edit the skill value.

Can Toloka users see that they were assigned a skill?

If it's a public or training skill, they see it and they get a message about it.

Can I show a skill in the task interface?

There is no such option. If the skill is public, the performer sees it in their profile.

Why can't I find the performer's gender in the user data, although I can filter people by this attribute in the pool settings?

Requesters can't see the full details about specific performers. So you can't see information like the date of birth, gender, last name, or first name. However, you can use filters by date of birth and gender (in the pool settings). This way you can select a group of performers without accessing the personal information of individual performers. This decreases the risk of user de-anonymization.

How do I automatically assign skills based on user responses to my questions?

You can do that using the Control tasks rule. For example, to assign the Student skill to performers who answer “student”:
  1. Upload the task file using Smart mixing.
  2. Specify student as the correct answer to the question. Don't take other questions into account (leave the fields empty or unselected).
  3. Add the Control tasks rule to the pool: if the percentage of correct control answers = 100, then set the skill value Student = 1.

Other questions

Quality control

How do I set quality control in a pool correctly?

The settings for quality control rules depend on the type of tasks. General recommendations:

  • Always use one or more ways to control quality of answers.

  • Counting fast responses makes sense for most tasks.

  • If the user has to choose between options (for example, by selecting checkboxes), check the answers using majority vote or control tasks.

  • If the user has to provide a response as a text or link or upload a photo, the best way to control quality is by reviewing assignments. You can outsource task acceptance to performers. Create a task with a question (for example, “Is this phrase translated correctly?”) and possible responses (for example, “yes”/“no”). Set up overlap and majority vote check.

  • If a task is more like an opinion poll (for example, choosing nice pictures from a set), majority vote is not a good way to control quality. Make control tasks with artificial examples where the choice is evident.

How many control tasks do I need to add?

We recommend adding at least 1% of control tasks in the pool. To filter out performers, use the Control tasks quality control rule. To rank performers by the quality of responses in control tasks, use a skill.

How are the correct responses to control questions counted?

The Control tasks rule starts working after the performer completes the number of control tasks you specified. If your pool contains both training and control tasks, you can take into account the responses in both of them (the Number of responses parameter) or only in control tasks (the Number of control responses parameter).

As soon as the needed number of responses is collected, Toloka calculates the percentage of correct and incorrect responses and performs an action (assigns a skill, or blocks the user in the pool or in the project). Then this percentage is updated as the tasks are completed by the performer. The number of the performer's last responses used for the calculation is set in the Recent values to use field. If you leave it empty, all the responses from the performer in the pool are counted.
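
The calculation described above can be sketched in Python. This is only an illustration of the logic, not Toloka's implementation: the function name control_rule_state and its parameters are hypothetical and simply mirror the Number of control responses and Recent values to use settings.

```python
from typing import List, Optional


def control_rule_state(
    answers: List[bool],
    min_responses: int,
    recent_values: Optional[int] = None,
) -> Optional[float]:
    """Return the percentage of correct control responses, or None if the
    rule has not started working yet.

    answers        -- True/False per control response, oldest first
    min_responses  -- hypothetical stand-in for "Number of control responses"
    recent_values  -- hypothetical stand-in for "Recent values to use";
                      None (or 0) means every response in the pool is counted
    """
    # The rule stays inactive until enough responses are collected.
    if len(answers) < max(min_responses, 1):
        return None
    # Keep only the most recent responses when a window size is given.
    window = answers[-recent_values:] if recent_values else answers
    return 100.0 * sum(window) / len(window)


history = [False, False, True, True, True]
print(control_rule_state(history[:3], min_responses=5))           # rule not active yet
print(control_rule_state(history, min_responses=5))               # all responses counted
print(control_rule_state(history, min_responses=5, recent_values=3))  # last 3 only
```

With an empty Recent values to use field, the two early mistakes keep affecting the percentage. With a window of 3, only the latest responses matter, so the performer's value recovers as they improve.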

Should I create a skill for every pool?

It is better to use one skill in a project. You can choose the way to calculate the skill:

  • Calculate the skill for each pool separately. The current skill value is the value of the skill in the pool the user completed last. This option is convenient if:

    • The pools are intended for different groups of performers (for example, there are filters by city or country).

    • Pools are started one by one and you don't want to take into account the responses in the previous pools to calculate the skill in the current pool.

    This calculation method is used by default when adding a quality control rule to a pool. For the control tasks block, leave the Recent values to use field empty.

  • Calculate skill based on all tasks in a project. This option is good if the pools are small and you don't need to have skill calculated for each pool.

    This option is available only for skills on control tasks. To use it, fill in the Recent values to use field in quality control rules in pools.

Can I make my training or control tasks totally different from the main tasks?

Your training and control tasks have the same project specification. However, you can create a separate project with the tasks and assign a skill based on user responses. Then you can admit performers to the main project based on their skill.

Isn't the exam a regular pool that I pay for? How does it differ from a regular pool?

An exam pool contains only control tasks. It's usually small and used for checking how well users learned to do your tasks after they read the instructions and completed the training. Unlike your main pool, you already know the correct responses for every task in this pool. You can set the price to zero.

Based on the results of responses to control tasks, you can assign a skill to the users and then specify it in the main pool as a filter. For example, MySkill = 80 or = Is missing. You don't have to create an exam. For simple tasks, the training pool provides enough practice, but many requesters also use exams.

Is the time specified per task suite in the fast response settings?

Yes, the fast response settings specify the time per task suite.

I set up quality control, then I copied my user requirements. All my quality control settings were deleted and replaced with the copied settings. Is that normal?

Yes. When you copy the filter and quality control settings, the settings you previously added manually are overwritten. You should see a warning about this in the copy settings window.

I set up a rule to ban users after the first incorrect captcha. This is to eliminate any bots. Is this too strict? What rule do most projects use?

Indeed, this rule is probably too strict. Even the most careful user can make a mistake, so you probably want to relax the rule. Besides the requester-specific bans, we have system processes that ban users who regularly fail captcha checks in Toloka.

The pool has an overlap and majority vote set up, but some fraudulent performer opens the task suites, does nothing, and submits empty assignments. Could this cheater get more tasks from the pool before the results of other performers are known? Could the cheater quickly click through a lot of task suites before the majority vote is accumulated to ban the cheater?

Yes, unfortunately, this can happen. This is why we recommend that you offer a training task or exam before the main task. In this case, only those people who showed good performance at the previous stage are selected for the main pool.

How do I set up an exam so that different people can take it without running out of tasks?

When you load tasks, use smart mixing. In this case, you'll have infinite overlap in your exam.

However, this poses the risk that you might spend a lot of money on the exam. You might want to open this pool only when the main pool opens, and close it when labeling of the main pool ends.

How do I test users to determine which kinds of tasks they do better and assign them relevant tasks? I don't want testing to affect the performer rating negatively.

You can add a training pool to test your performers. Based on the test results, assign skills to the users for the tasks they do best.

Then open your pools only to the users that have a certain skill: use filters for this.

This won't lower performer ratings. Even if you ban users from your project based on the testing results, this won't affect their rating.

If I upload tasks using smart mixing, does it mean that the same file should contain both the control tasks and main tasks?

You can upload your main and control tasks separately using different files.

If a cheating performer gives a lot of incorrect responses, and the system eventually bans them for errors in control tasks, do I have to pay for the bad responses anyway?

If the user already got paid for the tasks, the money can't be refunded to you.

Can I control the frequency of showing captchas to the performers? Some performers get a bit demotivated by that.

The frequency of issuing captchas is set up in the pool:

  • No: don't show captchas.
  • Low: show a captcha after every 20 task suites.
  • Average/High: show a captcha after every 10 task suites.

Can one performer get access to two pools in the same project? Can I avoid that?

Yes, if they can access both pools, they can do both of them. To restrict access to subsequent tasks for a performer, use the Completed tasks rule and select a ban at the project level.

If I ban a performer for doing my tasks too fast, will all their responses be deleted and given to other performers for labeling?

No. The responses of these performers aren't automatically excluded from the final results file.

But you can do it yourself if you want. When downloading the results, select the option Exclude assignments by banned users. You can also forward all the assignments from banned users to other performers using a rule.

Can I create two active training pools, one for practice and the other for admitting users to the main pool? In other words, one pool is for users to practice and the other pool tests them.

Yes, you can do that. In this case, create the first pool based on the training pool and the exam pool based on your main pool. If a pool contains only control and/or training tasks, the price can be set to zero.

In the exam pool, you can create a skill reflecting the exam result and granting admission to the main pool. For example, if the number of responses is ≥ 10, set the value of <exam skill> to the percentage of correct responses. In your exam pool user requirements, specify: <exam skill> < 80 or = Is missing. In the main pool, set up a filter: <exam skill> >= 80 and (<main skill> >= 70 or = Is missing). You can choose the skill values depending on how well the performers handle your task.
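
To see how the two filters from this example combine, here is a minimal Python sketch. It only models the filter logic. The keys exam_skill and main_skill are hypothetical placeholders for <exam skill> and <main skill>, and a missing key models the "Is missing" condition.

```python
from typing import Dict


def admitted_to_exam(skills: Dict[str, float]) -> bool:
    """Exam pool requirement: <exam skill> < 80 or = Is missing."""
    exam = skills.get("exam_skill")  # None models "Is missing"
    return exam is None or exam < 80


def admitted_to_main(skills: Dict[str, float]) -> bool:
    """Main pool filter: <exam skill> >= 80 and (<main skill> >= 70 or = Is missing)."""
    exam = skills.get("exam_skill")
    main = skills.get("main_skill")
    passed_exam = exam is not None and exam >= 80
    main_ok = main is None or main >= 70
    return passed_exam and main_ok


newcomer = {}                                        # no skills yet
graduate = {"exam_skill": 90}                        # passed the exam, no main skill yet
slipping = {"exam_skill": 90, "main_skill": 50}      # quality dropped in the main pool

for performer in (newcomer, graduate, slipping):
    print(admitted_to_exam(performer), admitted_to_main(performer))
```

A newcomer sees only the exam, a performer who passed the exam sees only the main pool, and a performer whose main skill falls below 70 loses access to the main pool.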

Can I get more details on the best practices for using captchas? For which projects is it better to use captchas and how often?

Captcha is usually used in simple projects with automatic acceptance, like classification, categorization, or information search. These are cases where there are few response options and users don't need to upload files or write texts. It helps you filter out bots and sloppy performers.

The frequency of issuing captchas is configured in the pool:

  • No: don't show captchas.
  • Low: show a captcha after every 20 task suites.
  • Average/High: show a captcha after every 10 task suites.

I found the following terms related to captcha in Help: “Percentage of correct responses” and “Percentage of incorrect responses”. Are they determined from the control sample?

The percentage of correct responses is based on the total number of captchas processed by the performer within the “range” specified in the Recent values to use field. If the value is empty, the percentage is calculated using all the captchas that are shown for the tasks in the pool which uses the captcha rule.

My task uses a form with multiple fields. When there is an overlap and “Majority vote” is used for quality control, is each field taken into account, or if one field mismatches the majority vote, are the task results considered incorrect?

All responses to the task are taken into account. If one response differs from the majority vote, the whole task is counted as mismatching the responses of other performers.
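
The all-fields comparison can be illustrated with a short Python sketch. It assumes the majority answer for the task is already known. The function names and the field names category and is_adult are hypothetical examples, not part of Toloka.

```python
from typing import Dict, List


def differing_fields(response: Dict[str, str], majority: Dict[str, str]) -> List[str]:
    """Names of output fields where the response differs from the majority answer."""
    return sorted(f for f in majority if response.get(f) != majority[f])


def matches_majority(response: Dict[str, str], majority: Dict[str, str]) -> bool:
    """The whole task matches only if no field differs."""
    return not differing_fields(response, majority)


majority = {"category": "cat", "is_adult": "no"}
same = {"category": "cat", "is_adult": "no"}
one_off = {"category": "cat", "is_adult": "yes"}

print(matches_majority(same, majority))      # every field agrees
print(matches_majority(one_off, majority))   # one field differs, so the task mismatches
print(differing_fields(one_off, majority))
```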

Have I understood correctly that if I set the skill value = 1 with the percentage of accepted responses >= 75 and 10 recent values to use, the user is given 1 skill point for every 8 correctly completed tasks out of 10?

No, this is incorrect. With these settings, each time a rule condition is met, the performer gets skill = 1. To change the skill value in the process of task review, you need a “multi-step” rule, which has multiple identical rules with different values of Total reviewed responses.

I created a training pool with one task containing a hint. The user fails to complete the task on the first attempt, but finally succeeds. The user gets the skill 0. How do I grant the user access to my tasks? The minimum required level that you can set is 10.

Technically, if you have only one task in your training pool, you don't have this option. The skill will be either 0 or 100. We recommend that you add several tasks, or at least 2 so that the performer will practice on the first task and will be able to do the second task correctly. In this case, you can admit users to your main pool starting from the skill value of 50.

You can also create a training pool based on the main pool. Assign a skill using the Control tasks rule: in this case, you can admit users with any skill level to your main pool, even if the value is zero. But we don't advise giving tasks to people who failed training.

Can I use non-automatic acceptance in the training pool?

No. But you can create a pool of the Training type based on your main pool and enable non-automatic acceptance there.

Can the performers see which questions are control tasks?

No, they can't.

I have two text versions that I want to show to my respondents: one version to half of the audience, and another version to the other half (like in A/B testing). Is it possible to do this in Toloka, or do I need to create two separate projects?

If you pass texts to the input data, you can load 2 different tasks in the pool. In one task, pass Text 1 in the INPUT: <input field name> field, and in the other task, use this field to pass Text 2. But if the text is in the HTML block of the task template, you need to clone the project. To let a performer do only one task in your project, use the Submitted responses rule. You can assign a skill or ban the performer after they submit one response.

If I ban users from my project so that everyone can complete a maximum of one task, are the users notified of the ban?

No, the users are unaware of the ban.

When I export a project from the Sandbox, the task files are not exported. Is this how it's supposed to work? I suddenly lost the markup of the control tasks that I created in the sandbox.

The tasks themselves are not exported, only the project configuration and the settings of the selected pool. However, you can download your marked up tasks from the Sandbox pool and import them to the pool you created. To download the control tasks only (if you marked them up in the interface), go to Mark up, then click Control tasks and Download.

I want to create an exam with three tasks. If a user does two out of three tasks correctly, they get the skill. So I try to use 3 in the “Recent values to use” field, but I get an error that the value is too small. Can I get around this without increasing the number of tasks to five?

The Recent values to use field is for the number of recent responses from the performer. If you use non-automatic acceptance for your task, then to set up your intended rule you need to specify 3 in Total reviewed responses.

What output format do I use for the review results to filter out mismatching users based on the “Majority vote”?

To perform actions with users (assign a skill or ban them) based on the majority vote, add a relevant rule to the pool.

Don't forget to enable Keep task order in the pool parameters. Majority vote is used in the projects with preset options (radio buttons or checkboxes). This rule won't apply to the text entry or file upload fields.

I want to create training and exam pools to match the entered text against a sample, and sometimes the matching fails. How do I implement this?

For a control or training assignment to be counted as correct, it must exactly match the control assignment. To do this, you need to normalize the response text using JavaScript: remove spaces, punctuation marks, special characters, and capital letters, and write the result in a separate output field. Now you can match the processed assignment text against your control text.
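
The normalization steps listed above are sketched here in Python, for example to bring the control text in your task file to the same form. In the task interface itself the same steps would have to be written in JavaScript with identical rules. The function name normalize_answer is hypothetical.

```python
import re


def normalize_answer(text: str) -> str:
    """Lowercase the text and keep only letters and digits."""
    lowered = text.casefold()
    # \W matches anything that is not a letter, digit, or underscore;
    # the underscore is listed explicitly so that it is removed too.
    return re.sub(r"[\W_]+", "", lowered)


control_text = "Hello, World!"
performer_text = "  hello world "

# Both strings reduce to the same normalized form, so they can be compared exactly.
print(normalize_answer(control_text) == normalize_answer(performer_text))
```

Removing all spaces and punctuation is deliberately aggressive. If word boundaries matter for your task, a softer rule (for example, collapsing repeated spaces instead of deleting them) may be a better fit.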

Another option to select performers for this type of projects is non-automatic acceptance.

In the section about control questions, does "Number of control responses" mean the total number of responses to control questions (including incorrect responses) or the number of correct responses to my control questions?

This is the total number of responses to the control questions.

How do I classify the users as good performers and poor performers as they complete the tasks and ban the poor performers without affecting their rating?

You can create a task pool for all your performers and create performer skills in it. In this case, you can open your tasks only to the performers with the necessary skills. This won't affect their rating.

Even if you ban a performer from the project, this won't affect their rating either.

Why has the speed of pool completion dropped?

Possible reasons:

  • You've stopped the training pool. This could limit the number of performers with access to the main pool. Start the training pool again so that more performers can access the tasks.

  • The filters you set are too strict. For example, a strong restriction on a certain skill that most users don't have.

  • Too many users are banned. Ease the quality control rules.

How can I speed up the pool completion?


Overlap

What overlap should I set?

Overlap defines how many performers complete the same pool task.

The best overlap is an overlap that provides satisfying quality of results. For most tasks that are not reviewed, overlap from “3” to “5” is enough. If the tasks are simple, overlap of “3” is likely to be enough. For tasks that are reviewed, set overlap to “1”.

Can I change overlap after the pool is started?

Yes. Open edit mode for the pool and set a new overlap value. You don't need to restart the pool. Updating the settings is usually fast, but if there are many tasks, it may take several minutes.

Can it happen with incremental relabeling that the pool closes before the tasks for minimal overlap run out? The overlap increased, and the pool closed, and I need to start it manually.

Yes, this might happen. You must set an adequate pool closing interval.

How does counting work if I set overlap = 3 in the pool and response threshold = 3 in the majority vote?

In this case, if you don't have 3 identical responses for your task (response threshold), no user would be considered a good or poor performer, because the system can't see which of the users made an error.

But if you set response threshold = 2 with overlap = 3, then two users with the same responses are considered good performers, but the third user, who gives a different response, is a poor performer.
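
The two cases above can be reproduced with a small Python sketch. It only models the reasoning in this answer and is not Toloka's implementation. The function name judge_by_majority and the performer IDs are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def judge_by_majority(
    responses: Dict[str, str], threshold: int
) -> Tuple[List[str], List[str]]:
    """Split performers into (matching, mismatching) lists for one task.

    Returns two empty lists when no response reaches the threshold,
    because then nobody can be judged.
    """
    if not responses:
        return [], []
    answer, votes = Counter(responses.values()).most_common(1)[0]
    if votes < threshold:
        return [], []
    good = sorted(p for p, r in responses.items() if r == answer)
    poor = sorted(p for p, r in responses.items() if r != answer)
    return good, poor


# Overlap = 3: three performers answered the same task.
responses = {"u1": "A", "u2": "A", "u3": "B"}

print(judge_by_majority(responses, threshold=3))  # nobody is judged
print(judge_by_majority(responses, threshold=2))  # u1 and u2 match, u3 mismatches
```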

Can I do it like this: set a basic overlap of 2 users, then, if both performers select the same response, close the pool, but if they give different responses, show the task to one more user?

Yes, you can do that. Set up incremental relabeling.
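
The rule from the question can be written down as a short Python sketch. It only restates that rule. Incremental relabeling itself is configured in the pool settings and may use different criteria to decide when to add overlap. The function name and the parameters base_overlap and max_overlap are hypothetical.

```python
from typing import List


def needs_extra_label(
    answers: List[str], base_overlap: int = 2, max_overlap: int = 3
) -> bool:
    """Return True if the task should be shown to one more performer."""
    if len(answers) < base_overlap:
        return False  # the basic overlap is still being collected
    if len(answers) >= max_overlap:
        return False  # overlap limit reached, stop asking
    return len(set(answers)) > 1  # performers disagree


print(needs_extra_label(["A", "A"]))       # both agree, no extra performer
print(needs_extra_label(["A", "B"]))       # disagreement, show to one more
print(needs_extra_label(["A", "B", "A"]))  # limit of 3 reached
```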

Is there a cross-check feature for tasks?

You can use overlap to let multiple performers do the same task. The overlap value is set up in the pool settings.

Why is the maximum number of completed tasks in the progress bar less than the total number of uploaded tasks?

The progress bar shows the number of task suites including the overlap. If the overlap is greater than one, the number of task suites is different from the total number of tasks.
