Adding tasks to a pool

  1. TSV file
  2. Using files from cloud storage
  3. Training-exam-retry
  4. Changing a running pool
Tip. If you are looking for the answer to a specific question, use Ctrl+F to search the page (Cmd+F on macOS).

TSV file

How many tasks should be in a suite?

The number of tasks depends on how difficult and time-consuming the tasks are. Keep the size reasonably small. Large task suites are unpopular, partly because they are inconvenient for performers (for example, if the internet connection is unstable).

What is the right time limit for completing a task?

Try completing the tasks yourself, and ask your colleagues and friends to complete them. Find the average completion time and add 50% to it.
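
The rule of thumb above is simple arithmetic. A minimal sketch, assuming you have collected a few test timings in seconds (the function name and sample values are hypothetical):

```python
from statistics import mean

def suggested_time_limit(completion_times_sec, margin=0.5):
    """Return the average completion time plus a margin (50% by default)."""
    return mean(completion_times_sec) * (1 + margin)

# Hypothetical timings (in seconds) from test runs by you and your colleagues.
timings = [80, 100, 120]
print(suggested_time_limit(timings))  # prints: 150.0
```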

How do I know how many tasks a performer will see on the page?

You can specify the number of tasks on the page when you upload your tasks to the pool. For more information about distributing tasks across pages, see this article.

When I generate a TSV file with links to images on Yandex.Disk, the images are not displayed. Why?

You can read about connecting Yandex.Disk here.

The project template must contain something like this:

<img src={{proxy img}} width="400">, where img is an input field in the string format.

Use the example.jpg file for testing. You can find its URL under Profile → External Services Integration.

Why does the preview display all the photos from the TSV file at once?

You must use a separate row for each task in your TSV file. For more information, see here.

When you create a pool, the pool will have settings for the number of tasks per page.

The system interprets commas inside my array elements as separators between the array elements. How do I avoid this?

Escape commas with a backslash (\).
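
If you generate the file with a script, the escaping can be applied automatically. This is a sketch only: the helper name is hypothetical, and it assumes elements are joined with a comma and a space, as in the array example elsewhere on this page.

```python
def encode_array(items):
    """Join array elements with commas, escaping commas inside elements with a backslash."""
    return ", ".join(item.replace(",", "\\,") for item in items)

# The first element contains a comma that must not be treated as a separator.
print(encode_array(["red, dark", "green"]))  # prints: red\, dark, green
```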

How do I upload the file with the accepted assignments back to Toloka for projects with non-automatic acceptance? Where do I find the format of the upload data?

Use the Upload review results button to upload your file. You can see the format here.

Assignments are reviewed in a TSV file.

How is the data from the "hint" column displayed?

The hint column should be filled out for your training tasks. When creating a main task, you only need to fill out the input fields. Omit the other fields or delete them along with their headers.

The file structure and how to fill it out is described here.

What do the lines "Add your text here" mean?

"Add your text here" is a hint for you. It means that you can replace the text in the field with your task data. The file structure and how to fill it out is described here.

Why do double quotes disappear from the output if I try to escape them using quotation marks?

If you have one word enclosed in quotes, format the uploaded assignment like this: "How many letters are there in the word ""Liechtenstein""". If you are escaping quotes inside your text, then the entire text must be enclosed in quotes. For more information, see the Guide.
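
When the file is generated by a script, a CSV library can apply this quoting for you. The sketch below uses Python's standard csv module with a tab delimiter; the helper name tsv_row is hypothetical.

```python
import csv
import io

def tsv_row(fields):
    """Serialize one TSV row; fields containing quotes are wrapped in quotes with inner quotes doubled."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", quotechar='"', doublequote=True,
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(fields)
    return buffer.getvalue().rstrip("\n")

print(tsv_row(['How many letters are there in the word "Liechtenstein"']))
# prints: "How many letters are there in the word ""Liechtenstein"""
```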

Why haven't I received assignments since I launched my first project, and all the uploaded assignments are marked as "Training"?

Check the hint field. For the main tasks, this field must be empty.

How do I create the task file properly so that there are no errors?

In the file with the main tasks, the columns with the INPUT headers must be filled out. You can see those headers if you download a sample file from the pool.

If you are creating control tasks, fill out the GOLDEN columns with the correct responses.

If you are creating a training task, you also need to fill in the HINT:text column. For the main tasks you don't need any columns other than INPUT, so feel free to delete them.

The file format must be TSV, and the encoding must be UTF-8.

For more information about creating the file, see the Guide. If there are errors during the upload, look up the error description on this page.
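
As an illustration of the rules above, here is a minimal sketch that builds the file contents with Python's csv module. The column name INPUT:image, the URLs, and the values are assumptions for the example; use the headers from the sample file downloaded from your pool.

```python
import csv
import io

def build_tsv(rows, columns):
    """Return TSV text with a header row followed by one row per task."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Main tasks: only INPUT columns are needed.
main_tasks = build_tsv(
    [{"INPUT:image": "https://example.com/1.jpg"}],
    ["INPUT:image"],
)

# Training tasks: the correct response and the hint are filled in as well.
training_tasks = build_tsv(
    [{"INPUT:image": "https://example.com/2.jpg", "GOLDEN:result": "cat", "HINT:text": "Look at the ears."}],
    ["INPUT:image", "GOLDEN:result", "HINT:text"],
)

# To save a file in UTF-8:
# with open("tasks.tsv", "w", encoding="utf-8", newline="") as f:
#     f.write(main_tasks)
print(main_tasks)
```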

Why do I see a syntax error when I upload a task where a user has to view an image and write feedback?

The error might occur if the expected input type is URL, but a string is received.

There may be two reasons:
  • The input field has the "link" type.
  • The pool was created for an outdated project version. It means that the pool was created before you changed the input field type.

What is the maximum number of tasks per page?

It depends on the task. Technically, you can use as many tasks as you want.

But users are reluctant to take lengthy tasks. They'd rather do 10 tasks that take one minute each than one task that takes 10 minutes.

In addition, if you use a large number of tasks on the page, there might be issues with uploading the files to be labeled. This problem might occur with images.

The third thing to consider is quality control and assignment review. If you use recompletion of assignments from banned users, you should split the task into smaller parts so that fewer assignments are recompleted. You are more likely to meet your budget this way.

I have a task for photo classification. When there are more than 5 photos on the page, why does Toloka split them across 2 pages?

Toloka will split the links to images in the uploaded file into pages depending on the method you specified when uploading the TSV file. For more information about the three upload methods, see the Guide.

Are TSV files sensitive to the order of the INPUT field and GOLDEN fields?

TSV files are insensitive to the order of fields. Use your preferred order of fields.

How do I add multiple "known_solutions" to a TSV file with a training task?

You can't use the interface to upload the tasks with multiple correct responses to the pool. You can only use the API for that.

Where is my TSV file added if I upload it to the running pool?

If you have the Keep task order option enabled, labeling of the new tasks will start after the previously uploaded tasks are taken by users. If this option is disabled, we can't guarantee that the tasks are assigned in the order they were uploaded.

How do I write an array to an input TSV file?

The array of strings in the input data must be comma-separated. For example, a value in the INPUT:types column: text1, text2, text3, text4

How do I properly structure my TSV file used for data upload if there is JSON data among the input?

All the values are written to the same column. Make sure to escape quotes. For more information about escaping quotes in JSON format, see the Guide.
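
A hedged sketch of one way to do this in a script: serialize the object with json.dumps and let the csv module quote the cell. It assumes the quote-doubling convention described for double quotes on this page also applies to JSON cells; the Guide remains the authoritative reference. The column name INPUT:data is hypothetical.

```python
import csv
import io
import json

def json_cell_tsv(header, payload):
    """Return TSV text with one column whose single cell holds the JSON-encoded payload."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([header])
    # The csv writer wraps the cell in quotes and doubles the quotes inside it.
    writer.writerow([json.dumps(payload, ensure_ascii=False)])
    return buffer.getvalue()

print(json_cell_tsv("INPUT:data", {"title": "Cat", "tags": ["a", "b"]}))
```

Reading the file back with csv.reader and json.loads should return the original object, which is a quick way to check the escaping before you upload.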

If there are no headers for some input columns in the TSV file, are they going to be skipped during import? Will they be skipped if they have headers without the "INPUT:.." prefix?

No. If you try to upload a file with missing headers to the pool, the system issues an upload error. All the INPUT fields required in the specification must be present in the TSV file with tasks. There must be no extra fields or columns.

If you don't want to show some data to performers, but you still need this data in the file, create the optional hidden input fields for such data in the project.

How do I insert a link in the GOLDEN field?

Text in the GOLDEN field must match the control text exactly.

Usually, if you copy site links from the browser, the copied links have the same format. But this is not the case when the link is trimmed or typed manually.

Check the links that you use. There are several ways to unify links:
  • Add requirements for the link format in your instructions and hints in your training pool.
  • Use RegExp in your JS to trim the received links and write the result to the new output field, and then match the received value against the control value.
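
The second option can be sketched as follows. In a real project this logic would live in the task template's JavaScript; the Python version below only illustrates the idea, and the choice of what to trim (scheme, "www." prefix, trailing slashes) is an assumption rather than a Toloka rule.

```python
import re

def normalize_link(url):
    """Trim a link to a canonical form so that equivalent links compare equal."""
    url = url.strip()
    url = re.sub(r"^https?://", "", url, flags=re.IGNORECASE)  # drop the scheme
    url = re.sub(r"^www\.", "", url, flags=re.IGNORECASE)      # drop a leading "www."
    url = re.sub(r"/+$", "", url)                               # drop trailing slashes
    return url

# Both spellings collapse to the same value, which can be matched against the control value.
print(normalize_link(" https://www.example.com/page/ "))  # prints: example.com/page
print(normalize_link("http://example.com/page"))          # prints: example.com/page
```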

How do I specify smart mixing settings in the interface when uploading a file?

Smart mixing settings are specified for the file rather than for the pool.

The settings specified during the first file upload are applied to all the files that are uploaded to this pool later on.

What is the difference between "task" and "task_suite"?

A task is a single item for the performer to complete. A task suite is a page with tasks. The performer gets paid for a task suite.

Errors when uploading tasks in the pool

How do I view the processing log?

To view the processing log, click More on uploading errors. The processing log is written in JSON format. Each object inside result corresponds to a line number in the uploaded file. Lines that were processed with an error have the status "success": false.

Tip. To work with a large log conveniently, copy it to a text editor.
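
For a large log, a short script can list only the failed lines. The sketch below assumes the log is a JSON object whose result field maps line numbers to entries with a "success" flag, as described above. The exact log layout and the sample entries are assumptions, so adjust the keys to match your actual log.

```python
import json

def failed_lines(log_text):
    """Return (line_number, entry) pairs for log entries with "success": false."""
    log = json.loads(log_text)
    return [(line, entry) for line, entry in log.get("result", {}).items()
            if entry.get("success") is False]

# Hypothetical log with one successful line and one failed line.
sample_log = '{"result": {"0": {"success": true}, "1": {"success": false, "exception_msg": "example"}}}'
for line, entry in failed_lines(sample_log):
    print(line, entry)
```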

Errors in column headers

If the column headings are incorrect, the whole file is rejected. Otherwise, Toloka specifies the number of tasks with processing errors.

Processing errors table

Error:
"parsing_error_of": "https://tlk.s3.yandex.net/wsdm2020/photos/2d5f63a3184919ce7e3e7068cf93da4b.jpg\t\t",
"exception_msg": "the nameMapping array and the sourceList should be the same size (nameMapping length = 1, sourceList size = 3)"

Overview: Extra tabs. If the TSV file contains more \t column separators after the data or the link than the number of columns set in the input data, you will get an error message. For example, if 1 column is defined in the input, and two more \t\t tabs are added in the TSV file after the link, you get 3 columns, 2 of which are extra.

How to fix: Remove the extra column separators (in the above example, both \t\t characters).

Error:
"exception_msg": "the nameMapping array and the sourceList should be the same size (nameMapping length = 4, sourceList size = 6)"

Overview: The number of fields in the header and in the row doesn't match.

How to fix: Make sure that:
  • The number of tabs in the file structure is correct.
  • String values with tab characters are enclosed in quotation marks " ".

Error:
"code": "VALUE_REQUIRED", "message": "Value must be present and not equal to null"

Overview: The value is missing for a required input field.

How to fix: Make sure that columns with required input data fields are filled.

Error:
"code": "INVALID_URL_SYNTAX", "message": "Value must be in valid url format"

Overview: Invalid data in the "URL" field.

How to fix: Make sure that the value in this field is a valid URL.

Error:
"exception_msg": "unexpected end of file while reading quoted column beginning on line 2 and ending on line 4"

Overview: Unpaired quotation mark in a string.

How to fix: Check that all quotation marks are escaped.
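
Mismatched column counts can be caught before uploading. This is a minimal sketch, not a Toloka tool: it parses the file with Python's csv module and reports rows whose number of fields differs from the header. Empty rows are skipped, and the sample data is hypothetical.

```python
import csv
import io

def find_column_mismatches(tsv_text):
    """Return (line_number, field_count) for non-empty rows whose field count differs from the header's."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    return [(reader.line_num, len(row)) for row in reader
            if row and len(row) != len(header)]

# The first data row ends with two extra tabs, so it has 3 fields instead of 1.
sample = "INPUT:image\nhttps://example.com/a.jpg\t\t\nhttps://example.com/b.jpg\n"
print(find_column_mismatches(sample))  # prints: [(2, 3)]
```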

The same task appeared on different pages

The same task may appear on different pages if:

  • The project uses incremental relabeling. As an example, let's say there were 5 tasks on a page. For 4 of them, responses coincided and the common response was counted as correct. The fifth task was mixed into another set because it didn't get into the final response and it needs to be “reassessed”.
  • Different tasks have different overlap. Tasks with higher overlap will be additionally shown in sets with the other remaining tasks in the pool.
  • If a quality control rule changes a task's overlap, it will appear in a different set.

Other questions

Using files from cloud storage

I can't upload files from Yandex.Disk

If images, audio or video from Yandex.Disk don't appear in the instructions or on the task suite, make sure you connected Yandex.Disk correctly and uploaded the files.

How to create a task where the performer has to view a video from Yandex.Disk

To create such a task, take the video markup template as a basis.

To host your videos on Yandex.Disk, connect Yandex.Disk and set up the project.

Why can't my task for selecting objects in an image display images from Yandex.Disk?

The problem is in your task template. Make sure that:
  • In the project, the input field where you pass the file link has the “string” type.
  • The component in the task template uses the "proxy" expression.
  • The format of relative links in the TSV file with tasks is correct: <unique name>/<file path and name>.

For detailed instructions and videos, see the page Using files from Yandex.Disk.

Frequent mistakes when connecting to Yandex.Disk and uploading files

  • The input data field in the project settings has the link type. Choose the string type instead.
  • The TSV file contains absolute links to the task files. Use relative links in the format <unique name>/<path and file name>. For example: yadisk/image1.jpg or yadisk/photos/image1.png.
  • Photos from Yandex.Disk are used in the task instructions in the mobile app. To display photos in the instructions, use only direct links.
  • The files were deleted or aren't located in the Yandex.Disk folder that the link leads to.
  • The OAuth token isn't active. Update the token on the External Services Integration page.
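
The link-format mistake from the list above can be checked with a script before uploading. A rough sketch, assuming that a valid value is a relative path starting with the unique name and containing no URL scheme; the function name is hypothetical and the check is deliberately simple.

```python
def is_relative_disk_link(value):
    """Return True if the value looks like <unique name>/<path and file name>."""
    has_scheme = "://" in value
    name, separator, path = value.partition("/")
    return not has_scheme and bool(name) and bool(separator) and bool(path)

print(is_relative_disk_link("yadisk/photos/image1.png"))          # prints: True
print(is_relative_disk_link("https://example.com/image1.jpg"))    # prints: False
```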

To display files from Yandex.Disk (images, audio files, videos) to the performer:
  1. Link Yandex.Disk in your profile.
  2. Set the string type for the input data field.
  3. Insert a file link using the proxy component.

Detailed instructions

Files load too slowly from Yandex.Disk. How do I speed up the loading process?

Try the recommendations on this page or contact Yandex.Disk support.

How do I embed multiple images from Yandex.Disk?

To add images using links to Yandex.Disk, use the link format: /api/proxy/<proxy name>/<path to image>.

In the requester profile settings, under External Services Integration → Proxy settings, set up integration with external services. For more information, see this page.

Why doesn't the task preview show my images from Yandex.Disk?

The problem is in your task template. Make sure that:
  • In the project, the input field where you pass the file link has the “string” type.
  • The component in the task template uses the "proxy" expression.
  • The format of relative links in the TSV file with tasks is correct: <unique name>/<file path and name>.

Detailed instructions.

How do I add a video hosted on Yandex.Disk to my task?

You can base it on the video markup template.

To host your videos on Yandex.Disk, connect Yandex.Disk and set up the project.

Other questions

Training-exam-retry

How do I precede my task with mandatory control questions to check that the user understood my instructions? Would such training or control tasks be similar to the main tasks?

The training and control questions must meet your project specification. However, you can create a separate project with your instructions, survey, and sample videos. Then you can assign a skill to users based on their responses. You can use this skill to admit performers to the main project.

More performers were trained than the training skill shows

The pool shows the total number of performers that completed at least one task suite. A training skill can be lost over time if you set repeated training in the pool settings. This setting allows a performer to pass the training again after a certain period if the performer didn't complete any tasks in associated pools or if there was a large time gap between completing tasks (for example, because of the ban). The training skill displays the performers who either recently completed training, or regularly complete your tasks so that the skill doesn't expire.

What's the difference between the exam pool that I pay for and the main pool?

An exam pool contains only control tasks. Usually it's small and intended to check how users learned to do your tasks after they read the instructions and completed the training.

Unlike your main pool, you already know the correct responses for every task in this pool. You can set the price to zero. Based on the responses to control tasks, you can assign a skill to the users and then specify it in the main pool as a filter. For example: <exam skill> >= 80 or = Is missing. You don't have to create an exam, because the training pool provides enough practice for simple tasks. But many requesters also use exams.

Which parameter affects the skill expiration?

The validity period of the training skills is controlled by the Retry after parameter.

The skill is deleted in the specified number of days if the performer:
  • Has a skill value lower than in the Level required field.
  • Didn't complete any tasks linked to the training during this period.

If their skill expires, your users need to complete the training again.

How do I make one parameter mandatory and the other parameters optional in my training task?

In the task file, leave empty control values for the optional output data.

How do I know when a particular performer got the skill?

  1. Go to the user card.
  2. Click the Profile tab.
  3. Find the required skill in the list and download the history of its changes.

Why do I have an infinite number of pages in the training pool?

Tasks have infinite overlap in the training pool. As long as the training pool is open and the training is running, users can access the tasks. Learn more about training pools.

How do I insert a link in the GOLDEN field?

Text in the GOLDEN field must match the control text exactly.

Usually, if you copy site links from the browser, the copied links have the same format. But this is not the case when the link is trimmed or typed manually.

Check the links that you use. There are several ways to unify links:
  • Add requirements for the link format in your instructions and hints in your training pool.
  • Use RegExp in your JS to trim the received links and write the result to the new output field, and then match the received value against the control value.

How do I use smart mixing to upload my main tasks separately from control tasks?

Smart mixing is set up when you upload tasks to the pool. After creating a pool, click Upload and select the method for generating task suites. You can upload them using separate files or one file, arranging them in any order.

Can I automatically pause accepting applications for the training pool if the necessary number of performers have been trained and are already doing the tasks?

You can close the pool manually at any time using the interface. However, you can't set the number of users that should complete the training pool for it to close automatically.

How do I check that the performers don't cheat during training?

Training helps users learn how to complete your task and figure out the instructions.

Based on the training results, you can select performers who did well enough for the main pool.

However, the mere fact that a performer completes your training pool successfully doesn't guarantee that they will afterwards demonstrate high quality on your main tasks. Performers who show a high level of accuracy during the training could have obtained correct responses from others.

Besides the training, be sure to add quality control rules and control tasks to your main pools. This way you can ensure the quality throughout the task performance process.

If the task requires that the users send free-format responses or data files, use non-automatic acceptance to pay for tasks after they are reviewed.

Why does the training pool allow smart mixing but doesn't allow adding by empty row?

This is a technical peculiarity of training pools. You can only upload tasks to your training pools this way. If you want to upload tasks to the training pool suite-by-suite, create the main pool, set the pool type to Training, and set the price to zero.

How do I create two active training pools: the first one for practice and the second one to admit the users to the main pool?

Create the first pool based on the training pool and the second pool based on the main pool with the pool type set to Exam. If a pool contains only control and/or training tasks, the price can be set to zero.

In the exam pool, you can create a skill reflecting the exam result and granting admission to the main pool. For example: if the number of responses is ≥ 10, set the skill value in <exam skill> to the % of correct responses.

In your exam pool user requirements, specify: <exam skill> < 80 or = Is missing.

In the main pool, set up a filter: <exam skill> >= 80 and <main skill> >= 70 or = Is missing. You can choose the skill values depending on how well the performers handle your task.
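
The two filters can be read as the following logic. This is one interpretation of the example, not Toloka's implementation: a missing skill is modeled as None, the thresholds 80 and 70 come from the example above, and the function names are hypothetical.

```python
from typing import Optional

def admitted_to_exam(exam_skill: Optional[float]) -> bool:
    """Exam pool: performers without the exam skill, or with a value below 80."""
    return exam_skill is None or exam_skill < 80

def admitted_to_main(exam_skill: Optional[float], main_skill: Optional[float]) -> bool:
    """Main pool: exam skill of at least 80, and main skill either missing or at least 70."""
    exam_ok = exam_skill is not None and exam_skill >= 80
    main_ok = main_skill is None or main_skill >= 70
    return exam_ok and main_ok

print(admitted_to_exam(None))      # prints: True
print(admitted_to_main(85, None))  # prints: True
print(admitted_to_main(85, 60))    # prints: False
```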

How do I create a training task so that the performer might fail it, but still be admitted to the main task pool?

Technically, if you have only one task in your training pool, you don't have this option. After the user completes the training task, their skill will be either 0 or 100.

We recommend that you add at least 2 steps: the performer will have enough practice with the first task to do the second task correctly. In this case, you can admit users to your main pool starting from the skill value = 50.
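
To see why a single task leaves no middle ground, the sketch below lists the skill values a performer can end up with, assuming the skill is the percentage of correct responses (the function name is hypothetical).

```python
def possible_skill_values(num_tasks):
    """List the skill values (percent of correct responses) reachable with num_tasks tasks."""
    return [round(100 * correct / num_tasks) for correct in range(num_tasks + 1)]

print(possible_skill_values(1))  # prints: [0, 100]
print(possible_skill_values(2))  # prints: [0, 50, 100]
```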

You can also create a training pool based on the main pool. To assign a skill, use the Control tasks rule. In this case, you can admit users with any skill level to your main pool, even if their skill is 0. But we don't advise giving tasks to people who failed training.

Do users have to complete all the tasks in the training pool?

If you enabled incomplete training and specified the number of training pages required, users don't have to fully complete the training in order to pass. If such settings are disabled, the users have to complete all the tasks in the training pool to get a training skill.

How do I set up a retry pool for my project?

You can create a retry pool similarly to an exam pool. In the pool settings, select the type Retry. In the retry pool filters, specify the upper and lower values of the <main skill> that the users must get in order to be admitted to the retry pool.

For example, if the main pool admits users with a skill of 70 or higher, then you can route the people with a skill between 40 and 69 to the retry pool.

To get a valid “range”, enter the skill twice: with an upper and a lower value. For example: <main skill> < 70 and <main skill> >= 40.
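
The routing in this example can be sketched as a simple range check. The thresholds 40 and 70 come from the example above; the function name and return values are hypothetical.

```python
def route_performer(main_skill):
    """Return the pool a performer is routed to, based on the main skill value."""
    if main_skill >= 70:
        return "main"
    if 40 <= main_skill < 70:  # the skill is checked twice: lower and upper bound
        return "retry"
    return None  # below 40: admitted to neither pool

print(route_performer(75))  # prints: main
print(route_performer(55))  # prints: retry
print(route_performer(20))  # prints: None
```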

We recommend that you don't make your exam and retry pools too lengthy, because performers don't like to do zero-price tasks. 10-20 tasks is enough, depending on complexity.

Is the training considered an active pool when the main pool is closed?

Yes, it is.

How do I make the training optional so that performers can decide themselves whether to take it or not?

Training is designed to select performers for the main task. That's why training must be linked to the main pool and become inactive as soon as the main pool closes.

The user is trained to get access to your paid tasks. If the training is optional, there probably won't be very many people who choose to complete it. Technically, “optional” training can be based on the main pool that includes some training tasks.

To show the training separately from other pools, disable Use project description and use this field to specify that this is an optional set of training tasks. In the pool settings, select the Training type.

Can I implement non-automatic acceptance in the training pool?

You can't use non-automatic acceptance in your training pool.

However, you can create a training pool with the Training type based on your main pool and enable non-automatic acceptance there.

Can I create training for projects where it is not possible to formulate the correct response exactly or review it automatically?

You can't create a training like this, because for the response to be counted as correct it must exactly match the control text.

For projects using free text input or attached files, you can make a pre-selection task with non-automatic acceptance. You can admit good performers to your main pool based on their skill.

How do I create an exam with a preset number of correct responses?

To do this, under Test result, go to Recent values to use and specify the number of recent responses from the performer.

Let's say you need to create an exam with three tasks, one task per page. If the performer succeeds in two out of three tasks, they get the skill.

If your task uses assignment review (non-automatic acceptance), to set up such a rule you need to specify 3 for "Total reviewed responses". As you can see in the screenshot, in the first case, all the performers who completed 3 task suites and whose answers are reviewed will get the skill. In the second case, only those who have 2 or 3 tasks accepted will get the skill.

How do I create a training and honey pots with an exam to get an output response other than the control value?

For a control or training assignment to be counted as correct, the response must exactly match the control value. To achieve this, normalize the response text using JavaScript: remove spaces, punctuation marks, and special characters, convert the text to lowercase, and write the result to a separate output field. Then you can match the processed text against your control text.
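
The normalization step might look like this. In the project itself it would be written in the template's JavaScript; the Python sketch below only shows the logic, and the exact set of characters to strip is an assumption.

```python
import re

def normalize_response(text):
    """Lowercase the text and remove spaces, punctuation, and other non-alphanumeric characters."""
    return re.sub(r"[\W_]+", "", text.lower())

# Differently formatted responses collapse to the same normalized value.
print(normalize_response(" Hello, World! "))  # prints: helloworld
print(normalize_response("hello world"))      # prints: helloworld
```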

Another option for selecting performers for a project of this type is assignment review (non-automatic acceptance).

How do I create a file with training tasks?

For training tasks, you need to:
  • Select the correct responses in the GOLDEN:result column.
  • Fill in the HINT:text column. It stores a hint to be shown if the user selects an incorrect response option.

Other questions

Changing a running pool

If I change the time allocated for one task, will this apply to tasks assigned earlier?

If you change the time allocated for a task, the time value will apply to the tasks that have not yet been taken by the performers. The same applies to the case when you close the pool. A performer who has an assignment in the active status can complete the assignment.

How do I edit or delete tasks uploaded to the pool?

If you uploaded tasks to the pool using “smart mixing”, you can stop the pool and mark up your tasks: edit answers, hints, or delete tasks.

If you uploaded them using a different method, clone your pool and upload the new file with the corrected list of data to be labeled.

I uploaded two files to the training pool. How do I delete one of them?

After uploading, all tasks are put into one list and can't be deleted separately.

  • If the pool hasn't started yet, delete all tasks. To do this, click Delete in the Pool tasks block. Then upload one file to the pool.
  • If the pool already started, delete tasks one-by-one in markup mode.

Other questions