Online testing. Taking online courses to the next level

Testing is one of the main methods of modern psychodiagnostics. In educational and professional psychodiagnostics it has firmly held first place in popularity worldwide for almost a century.

In this section, tests should be understood as methods that consist of a series of tasks with a choice of ready-made answer options. When calculating the scores for the test, the selected answers receive an unambiguous quantitative interpretation and are summed up. The total score is compared with quantitative test norms, and after this comparison, standard diagnostic conclusions are formulated.
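The scoring procedure just described (points per selected answer, a summed total, comparison with norms) can be illustrated with a minimal sketch. The scoring key, the norm bands and the conclusion labels below are hypothetical values invented for the example; a real test would take them from its standardization data.

```python
# Hypothetical scoring key: item -> points awarded for each answer option.
SCORING_KEY = {
    "q1": {"a": 0, "b": 1, "c": 2},
    "q2": {"a": 2, "b": 0, "c": 1},
    "q3": {"a": 1, "b": 2, "c": 0},
}

# Hypothetical norms: (lowest total score of the band, conclusion),
# listed from the highest band to the lowest.
NORMS = [(5, "high"), (3, "average"), (0, "low")]


def total_score(answers):
    """Sum the points assigned to the selected answer options."""
    return sum(SCORING_KEY[item][option] for item, option in answers.items())


def conclusion(score):
    """Compare the total score with the norm bands and return the label."""
    for lower_bound, label in NORMS:
        if score >= lower_bound:
            return label
    raise ValueError("score is below the lowest norm band")


answers = {"q1": "c", "q2": "c", "q3": "a"}
score = total_score(answers)
print(score, conclusion(score))
```

Because the key and the norms are plain data, the same two functions serve any test of this kind; only the tables change.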

The popularity of the test method is due to the following main advantages:

1) standardization of conditions and results. Test methods are relatively independent of the qualifications of the user (administrator); even a laboratory assistant with a secondary education can be trained for this role. This does not mean, however, that a qualified specialist with a full higher education in psychology is not needed to prepare a comprehensive conclusion on a battery of tests;

2) efficiency and economy. A typical test consists of a series of short tasks, each of which usually takes no more than half a minute to complete, and the entire test usually takes no more than an hour. A whole group of subjects can be tested at once, which saves considerable time in data collection;

3) the quantitative, differentiated nature of the assessment. The fine gradation of the scale and the standardization of the test make it possible to treat it as a "measuring tool" that gives a quantitative assessment of the measured properties. The quantitative nature of the results makes it possible to apply a well-developed psychometric apparatus for assessing how well a given test works on a given sample of subjects under given conditions;

4) optimal difficulty. A professionally designed test consists of items of optimal difficulty, so that the average subject scores approximately 50% of the maximum possible number of points. This is achieved through preliminary trials, that is, a psychometric experiment (or piloting). If piloting shows that about half of the surveyed group copes with a task, the task is considered successful and is left in the test;

5) reliability. This is probably the most important advantage of tests in educational psychodiagnostics. The lottery nature of modern exams, with the drawing of lucky or unlucky tickets, has long been a byword. A lottery for the examinee turns into low reliability for the examiner: the answer to one fragment of the curriculum is, as a rule, not indicative of how well the entire material has been learned. In contrast, any well-designed test covers the main sections of the curriculum (the tested area of knowledge or the manifestation of some skill or ability). As a result, the chance for students with failing grades to break into the ranks of excellent students, and for an excellent student to suddenly fail, is sharply reduced;

6) fairness. This is the most important social consequence of the advantages listed above. Fairness should be understood as protection from examiner bias. A good test puts everyone on an equal footing. As is well known, the examiner's subjectivity manifests itself most strongly not in judging how well a problem has been solved (it is not so easy to call black white, or a solved problem unsolved), but in the biased selection of tasks: easier ones for one's own, harder ones for others. Tests support the most important function of the school as a social filter, the function of "socio-professional selection". The extent to which such selection turns out to be fair is of tremendous importance for the development of society. It is therefore important for everyone who has access to tests and their results to learn a culture of competent and humane use of tests, because only a conscientious and qualified attitude on the part of users turns tests into a tool that raises, rather than lowers, the level of fairness in society;

7) the possibility of computerization. This is not just an additional convenience that reduces the manual labor of qualified staff during a mass examination. Computerization improves all testing parameters (for example, with adaptive computer testing, testing time is sharply reduced). Computerization is also a powerful tool for ensuring information security (the trustworthiness of diagnostics). Computer-based testing, which involves the creation of large banks of test items, makes it technically possible to prevent abuse by unscrupulous examiners. The tasks offered to a particular subject can be chosen from such a bank by the computer program itself during testing, so that the presentation of a specific task to this subject is as much a surprise for the examiner as it is for the subject.
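The piloting rule described under optimal difficulty above (keep a task if about half of the pilot group solves it) can be sketched as follows. The pilot data and the acceptance band of 0.3 to 0.7 around the 50% target are assumptions made for illustration; the text itself only says "about half".

```python
def item_difficulty(results):
    """Share of pilot subjects who solved each item.

    `results` is a list of rows, one row per subject, with 1 for a
    correct answer and 0 for an incorrect one.
    """
    n_subjects = len(results)
    n_items = len(results[0])
    return [sum(row[j] for row in results) / n_subjects for j in range(n_items)]


def select_items(results, low=0.3, high=0.7):
    """Indices of items whose difficulty falls inside the assumed band."""
    return [j for j, p in enumerate(item_difficulty(results)) if low <= p <= high]


# Hypothetical pilot run: 4 subjects, 4 items.
pilot = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
]

print(item_difficulty(pilot))  # share of correct answers per item
print(select_items(pilot))     # items kept in the test
```

In this toy data set the first item is solved by everyone and the third by no one, so neither helps to distinguish between subjects; only the item solved by half of the group survives the filter.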

In many countries, the adoption of the test method (as well as resistance to its adoption) is closely related to socio-political circumstances. The introduction of well-equipped test services in education is the most important tool in the fight against corruption that affects the ruling elite (nomenklatura) in many countries. In the West, test services operate independently of issuing (schools) and receiving (universities) organizations and provide the applicant with an independent certificate of test results, with which he can go to any institution. This independence of the testing service from issuing and receiving organizations is an additional factor in the democratization of the process of selection of professional personnel in society, giving a talented and simply hard-working person an extra chance to prove himself.

The need to assess and verify the level and quality of knowledge arises in any human activity. The problem of the adequacy and validity of test results becomes even more acute with the remote and widespread use of information technologies for testing the knowledge of students, schoolchildren, teachers and other categories of people for whom test results are of great personal importance.

Knowledge level control is an important part of the learning process. It provides feedback in the "trainee-teacher" system. Knowledge control performs controlling, teaching, diagnostic, educational, motivating and other functions in the educational process. To manage the learning process at various stages, the supervising specialist must constantly have information about how students perceive and assimilate educational material.

From the teacher's point of view, control is a long and laborious part of the work. It can be made easier and more systematic with the help of software tools. The control-related functions fall into three areas: preparing for control, conducting control, and providing feedback in the learning process. A set of tools united by a common logic and idea may constitute a tool system. Such a computer-based tool system serves as the means of implementing computer-based control.

Students' activity can be monitored when special control tests are available. Tests are a special type of task that makes it possible to check quickly, and for a whole group at once, how well students in theoretical and vocational training classes have assimilated knowledge and acquired skills and abilities, and to establish internal and external feedback, on the basis of which students and the teacher manage the learning process. Testing has long been used in pedagogy as a method of knowledge control.

Currently, there are many computer programs for testing. There are many products (including multimedia ones) with ready-made test tasks, as well as shell programs for creating tests on your own. There are a number of authoring tools created by domestic and foreign specialists. The computer tests developed with them have the properties inherent in such systems: adaptability, openness, standardization, extensibility, the ability to control students' knowledge individually and in groups, etc. Thanks to its versatility, a test system provides automated support for students' independent work: it allows control and self-control of the level of assimilation of the material and can act as a simulator in preparation for exams.

Chapter 1. Computer testing

1.1 The essence of the concept "Test"

To understand the essence of tests, it is important to understand the system of concepts. Concepts in general form the basis of any science, and in this sense, the activity of developing and effectively applying tests is no exception. Starting from the 1930s, the science of tests was called bourgeois, all the goals of which were considered "reactionary". And although such judgments are now considered inadequate to the spirit of our time, nevertheless there are publications where they still try to deny the tests scientific character.

The first scientific works on the theory of tests appeared at the beginning of the twentieth century, at the intersection of psychology, sociology, pedagogy and other so-called behavioral sciences. Foreign psychologists call this science psychometrics, and educators call it pedagogical measurement. Since there was no common name in Russian, Avanesov called this science testology, which can be pedagogical, psychological or sociological, depending on where it is applied and developed. Unclouded by ideology and politics, the interpretation of the name "testology" is simple and transparent: the science of tests. In the 21st century, Avanesov brought the name of this science into line with its name in the West: pedagogical measurement.

Let us dwell on the definition of the concept of a "test", since the term is currently used in a wide range of senses.

A test (from the English test: trial, examination, investigation) is an experimental method in psychology and pedagogy: a set of standardized tasks that make it possible to measure the psychophysiological and personal characteristics of the subject, as well as his knowledge, skills and abilities.

Tests began to be used in 1864 by J. Fisher in the UK to test students' knowledge. The theoretical foundations of testing were developed by the English psychologist F. Galton in 1883: the application of a series of identical tests to a large number of individuals, statistical processing of the results, and the selection of evaluation standards.

The term "test" was first introduced by the American psychologist J. Cattell in 1890. The series of 50 tests he proposed actually represented a program for determining primitive psychophysiological characteristics based on the most developed psychological experiments of that time (for example, measuring the strength of the right and left hands with a dynamometer, the speed of reaction to sound, etc.).

The word "test" evokes a variety of ideas. Some believe that tests are questions or tasks with one ready-made answer that must be guessed. Others see the test as a form of play or fun. Still others simply treat it as a translation of the English word "test" (trial, examination, check). In general, there is no consensus on this issue. Moreover, pedagogy textbooks say nothing about it, and where they do, it is often difficult to understand what is written. It is no coincidence that the range of opinions about tests is so wide: from everyday judgments to attempts at a scientific interpretation of the essence of tests.

In science, there are significant differences between the simple translation of a word and the meaning of a concept. Most often, we meet with a simplified perception of the concept of "test" as a simple choice of one answer from several proposed to the task. Numerous examples of such seemingly "tests" are easy to find in newspapers and magazines, in various competitions and in numerous book publications called "Tests". But even these are often not tests, but something outwardly similar to them. Usually these are collections of questions and tasks designed to select one correct answer from among those offered. They are only superficially similar to the real test. Differences in understanding the essence of tests give rise to differences in attitudes towards tests.

How do dictionaries define the concept of a "test"?

Big Encyclopedic Dictionary. Test (eng. test - sample, test, study):

1) in psychology and pedagogy - standardized tasks, the results of which are used to judge the psychophysiological and personal characteristics, as well as the knowledge, skills and abilities of the subject;

2) in physiology and medicine - trial effects on the body in order to study various physiological processes in it, as well as to determine the functional state of individual organs, tissues and the body as a whole;

3) in computer technology - a control task for checking the correct operation of a computer;

4) in pattern recognition - a set of functionally interdependent features that characterize a pattern (class).

Modern explanatory dictionary of the Russian language T.F. Efremova. Test:

1) a task or trial of a standard form, the results of which make it possible to judge a person's abilities, predispositions, etc., as well as the knowledge and skills of the subject;

2) method of research, diagnostics, which consists in a trial effect on the body (in physiology, medicine);

3) a questionnaire used in sociological research;

4) a problem with a known solution, designed to check the correct operation of a computer (in computer technology).

Explanatory dictionary of the Russian language D.N. Ushakov. Test (English test) (psych.):

Psychotechnical test, which consists in the fact that the subject is asked to solve one or more tasks to determine one or another of his abilities (memory, attention, speed of reaction, etc.).

Today, there are many types of tests, so it is hardly possible to give a universal definition for all these types.

An analysis of the literature shows that the concept of a "test" is formulated in different ways. Regardless of type and purpose, however, I would give the following definition of a test:

A test is one of the knowledge control methods that allows the teacher to establish the actual, theoretical knowledge of students and evaluate them in a fairly short time. It should be noted that the test does not take into account the individual characteristics of a person.

1.2 Specifics of computer testing and its forms

Since the beginning of the 21st century, computer testing (CT) has become widely used in education; in it, the presentation of tests, the evaluation of students' results and the delivery of results to them are carried out on a PC. Computer testing is especially called for where there is an urgent need to abandon traditional paper-based tests: when conducting exams for children with disabilities, with serious visual or hearing impairments, etc. The test generation stage can proceed technologically in different ways, including by entering paper-based tests into the computer. To date, there are numerous publications on computer testing, and software and tools for generating and presenting tests have been developed.

Computer testing can be carried out in various forms, differing in the technology of combining tasks into a test. Some of them have not yet received a special name in the literature on test issues.

The first form is the simplest. A ready-made test, standardized or intended for current control, is loaded into a special shell, whose functions may vary in completeness. Usually, during final testing, the shell makes it possible to present tasks on the screen, evaluate the results, form a matrix of test results, process it and scale the primary scores of the test subjects by converting them to one of the standard scales, so that each test subject receives his own score and a protocol of marks for the individual test tasks.
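The processing step that such a shell performs, from the matrix of results to primary scores and then to a standard scale, might look like the sketch below. The choice of the T-scale (mean 50, standard deviation 10) is an assumption made for the example; the text does not say which standard scale is used, and the result matrix is invented.

```python
import statistics


def primary_scores(matrix):
    """Primary (raw) score of each subject: the number of solved items."""
    return [sum(row) for row in matrix]


def to_t_scores(raw_scores):
    """Convert raw scores to T-scores: 50 + 10 * z."""
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)  # population SD of the tested group
    if sd == 0:
        # Everyone has the same score, so everyone sits at the scale centre.
        return [50.0 for _ in raw_scores]
    return [50 + 10 * (x - mean) / sd for x in raw_scores]


# Hypothetical matrix of results: one row per subject, one column per item.
matrix = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
]

raw = primary_scores(matrix)
for subject, (r, t) in enumerate(zip(raw, to_t_scores(raw)), start=1):
    print(f"subject {subject}: raw={r}, T={t:.1f}")
```

The scaled score depends only on the group mean and spread, so results of tests with a different number of items become comparable on one scale.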

The second form of computer testing involves the automated generation of test variants with the help of software tools. Variants are created before the exam or directly during it from a bank of calibrated test items with stable statistical characteristics. Calibration is achieved through long preliminary work on building the bank, whose parameters are obtained on a representative sample of students, as a rule over 3-4 years using paper-based tests. The content validity and parallelism of the variants are ensured by a strictly regulated selection of tasks for each variant in accordance with the test specification.
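A minimal sketch of variant generation under a test specification is given below. The bank contents, the topic names and the specification are hypothetical, and a real bank would also store calibrated difficulty parameters that the selection would take into account.

```python
import random

# Hypothetical item bank: each item is tagged with the topic it covers.
TOPICS = ["algebra"] * 5 + ["geometry"] * 4 + ["statistics"] * 3
BANK = [{"id": i, "topic": topic} for i, topic in enumerate(TOPICS)]

# Hypothetical specification: how many items each topic contributes.
SPEC = {"algebra": 2, "geometry": 2, "statistics": 1}


def generate_variant(bank, spec, rng):
    """Draw a variant that satisfies the specification topic by topic."""
    variant = []
    for topic, count in spec.items():
        pool = [item for item in bank if item["topic"] == topic]
        if len(pool) < count:
            raise ValueError(f"not enough items in the bank for topic {topic!r}")
        # sample() draws without replacement, so no item repeats in a variant.
        variant.extend(rng.sample(pool, count))
    return variant


rng = random.Random(2024)  # fixed seed only to make the example repeatable
for item in generate_variant(BANK, SPEC, rng):
    print(item["id"], item["topic"])
```

Every variant has the same topic structure, which is what makes the variants parallel in content, while the particular items differ from one subject to the next.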

The third form, computer adaptive testing, is based on special adaptive tests. The idea of adaptivity rests on the consideration that it is useless to give a student test tasks that he will certainly solve without the slightest difficulty, or tasks that he is guaranteed to fail because of their high difficulty. It is therefore proposed to optimize the difficulty of tasks, adapting it to the level of preparedness of each subject, and to shorten the test by eliminating some of the tasks.
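The adaptive idea can be illustrated with a deliberately simple "staircase" rule: after a correct answer the next task is harder, after an incorrect one it is easier. This rule, the three difficulty levels and the simulated student are assumptions made for the sketch; operational adaptive tests usually select items with an IRT model rather than with fixed steps.

```python
def adaptive_test(items_by_level, answer_fn, start_level, max_items):
    """Run a staircase adaptive test and return the list of steps.

    `items_by_level` maps consecutive integer difficulty levels to lists of
    item ids; `answer_fn(item, level)` returns True for a correct answer.
    """
    lowest, highest = min(items_by_level), max(items_by_level)
    level = start_level
    used = set()
    history = []
    for _ in range(max_items):
        candidates = [i for i in items_by_level[level] if i not in used]
        if not candidates:
            break  # no unused items left at this level
        item = candidates[0]
        used.add(item)
        correct = answer_fn(item, level)
        history.append((item, level, correct))
        # Step up after success, down after failure, staying within bounds.
        level = min(level + 1, highest) if correct else max(level - 1, lowest)
    return history


# Hypothetical bank with three difficulty levels.
items = {1: ["e1", "e2"], 2: ["m1", "m2"], 3: ["h1", "h2"]}


def student(item, level):
    """Simulated student who copes with levels 1 and 2 only."""
    return level <= 2


for step in adaptive_test(items, student, start_level=2, max_items=4):
    print(step)
```

The simulated student is never shown the easy tasks: the test oscillates between the level he can handle and the one he cannot, which is where the tasks are most informative about him.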

When conducting computer testing, it is necessary to take into account the psychological and emotional reactions of students. Negative reactions are usually caused by various restrictions that are sometimes imposed when tasks are presented in computer testing. For example, either the order in which tasks are presented is fixed, or a maximum time is set for each task, after which the next task appears regardless of the subject's wishes. Students taking adaptive tests are dissatisfied that they cannot skip a task, view the entire test before starting work on it, or change their answers to previous tasks. Sometimes schoolchildren object to computer testing because of the difficulties that arise in performing and recording mathematical calculations, etc.

To reduce the influence of students' computer experience on test scores, it is recommended that shells for computer testing include special instructions and training exercises for each innovative form of task. It is also necessary to familiarize students with the program interface in advance, to conduct rehearsal testing, and to place students who do not have sufficient experience with a PC in separate groups in order to train them additionally or give them a paper-based test.

Thus, computer testing acts as a tool for managing the educational process, as an element of feedback, which makes it possible to analyze the educational process, make adjustments to it, i.e. exercise full control over the learning process. The constant use of computer tests as an intermediate control of progress defines the educational process as a system of continuous control and self-control of students, which allows the teacher to receive "feedback", and students the opportunity to monitor their level of preparedness throughout the entire training.

1.3 Advantages and disadvantages of computer testing

The advantages of computer testing are:

Objectivity. The factor of subjective approach on the part of the examiner is excluded. Processing of test results is carried out through a computer;

Validity. The “lottery” factor of a regular exam, in which anyone can draw an “unlucky ticket” or task, is excluded: a large number of test tasks cover the entire volume of material in a particular subject, which allows test takers to show the breadth of their knowledge and not “fail” because of a random gap in knowledge;

Simplicity. Test questions are more specific and concise than ordinary exam tickets and tasks and do not require a detailed answer or justification: it is enough to choose the correct answer or establish a correspondence;

Democracy. All test takers are in equal conditions, test results are transparent;

Mass coverage and short duration. A large number of test takers can be covered by final control within a set period of time, and the time saved can be used to study new material or consolidate the old;

Technological efficiency. Conducting an exam in the form of a test lends itself well to technology, since it allows automatic processing;

Trustworthiness of information about the amount of learned material and the level of its assimilation;

Reliability. The test score is unambiguous and reproducible;

Differentiating ability, due to the presence of tasks of various levels of difficulty;

Implementation of an individual approach to learning. Individual testing and self-testing of students' knowledge is possible.

Along with the advantages of computer methods, there are also disadvantages:

    Communication between a person and a computer has its own specifics, and not everyone is equally at ease with computer testing. For example, if the testing procedure drags on or the content of the test does not interest a person, a positive attitude can give way to the opposite: the monotony of the work and the "stupidity" of the questions and tasks begin to tire and annoy. Sometimes a negative attitude towards computer testing is also caused by the lack of feedback. And when the tested person does not receive feedback, the probability of erroneous answers increases (he may misunderstand the instructions, mix up the answer keys, etc.).

Special studies have been conducted to determine how people feel about computer testing. It turned out that some people show the so-called psychological barrier effect, and some the effect of overconfidence. It happens that a person cannot cope with the task at all because he is "afraid" of the computer. Psychological defense mechanisms may also come into play, associated with the unwillingness of the tested person to reveal himself, the desire to avoid excessive frankness, or deliberate distortion of the results;

    With computer testing, specialists deal only with the results obtained. They do not see the person being tested, do not communicate with him, and therefore do not have additional information about him, cannot find out his actual amount of knowledge;

    Test control does not contribute to the development of oral and written speech of students;

    The breadth of coverage of topics in testing has a downside. The student during testing, unlike an oral or written exam, does not have enough time for any in-depth analysis of the topic;

    There is an element of randomness in testing. For example, a student who did not answer a simple question may give the correct answer to a more complex one. The reason for this can be both an accidental mistake in the first question, and guessing the answer in the second. This distorts the test results and leads to the need to take into account the probabilistic component in their analysis.
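The probabilistic component mentioned in the last item can be quantified for closed-form tasks. The sketch below assumes that every task has the same number of answer options and that a student who does not know the answer picks an option at random; it also shows the classical correction for guessing, right minus wrong divided by (options - 1), which is one common way of accounting for lucky guesses and is not taken from the text above.

```python
from math import comb


def expected_guess_score(n_items, n_options):
    """Average number of items solved by pure guessing."""
    return n_items / n_options


def prob_at_least(k, n_items, p):
    """Binomial probability of k or more correct answers out of n_items."""
    return sum(
        comb(n_items, i) * p**i * (1 - p) ** (n_items - i)
        for i in range(k, n_items + 1)
    )


def corrected_score(right, wrong, n_options):
    """Classical correction for guessing; omitted items are not penalized."""
    return right - wrong / (n_options - 1)


# Hypothetical test: 20 items with 4 options each, pass mark of 12.
print(expected_guess_score(20, 4))
print(prob_at_least(12, 20, 0.25))
print(corrected_score(right=14, wrong=6, n_options=4))
```

The longer the test and the more options each task has, the smaller the chance of reaching a given mark by guessing alone, which is one practical argument for tests of several dozen items.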

Chapter 2. Computer control of knowledge

2.1 Classification of types of computer tests

Obviously, the first task in testing the acquired knowledge should be to determine the goals of control. So in universities, it is more necessary to check the depth of knowledge of academic disciplines among students, the ability of future specialists to think logically, compare various objects and phenomena, draw the right conclusions and make optimal decisions. This means that a set (database) of control tasks should cover the academic discipline as fully as possible, and their thematic division should allow for step-by-step control in the process of studying the subject, identify individual knowledge gaps of students, adjust curricula, etc.

An important place in the formation of the base of tasks is their formulation. Like any sentence, tasks are divided into explicit and implicit, interrogative and affirmative, judgments, opinions and other questions. The variety of their forms, which carries the richness of the language, the abundance of special terms, depends on the art of the teacher. The use of such tasks helps to increase the students' ability to think logically, as well as the level of their general culture.

Another important point is determining whether the student's answer to the proposed question is correct. There are various forms of answers that a program can accept. It would be preferable for the student to "respond" to the computer as if orally, as to a teacher (the open-answer form). It is possible that expert systems of this kind, based on specially developed knowledge bases, will appear in the near future with the introduction of fifth-generation computers. In the meantime, some experts have begun to create knowledge bases. This interesting and complex approach has an inherent drawback: the subjectivity of a system that rests on the assessments of events and phenomena by individual, albeit sometimes very authoritative, specialists. At present it is perhaps most reasonable to adopt computer control systems based on databases. In this case, various ready-made answer forms, or templates, are usually used.

A widespread form is one in which the respondent is offered a pre-formed set of answers and selects the one or more that, in his opinion, are correct (the closed-answer form). The program automatically evaluates the correctness of the choice made. In another case, the person being tested types in some wording or individual words that answer the question posed (the half-open answer form). The accepted answers are not displayed on the screen, but the program contains what its authors consider the fullest possible set of them. It is assumed that in most cases the program contains the necessary variants and, after making a comparison, can judge the correctness of the answer. There are other options as well. Each way of collecting the subjects' responses has its own advantages and disadvantages, and the choice should follow from the goal of the control.
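Both answer forms reduce to a comparison with stored data, as the following sketch shows. The normalization applied to typed answers (lower case, collapsed spaces) is an assumption about what "necessary modifications" a program might tolerate; the sample answers are invented.

```python
def grade_closed(selected, correct):
    """Closed form: the chosen options must match the correct set exactly."""
    return set(selected) == set(correct)


def normalize(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def grade_half_open(typed, accepted):
    """Half-open form: the typed answer must match one stored variant."""
    return normalize(typed) in {normalize(variant) for variant in accepted}


# Closed form with two correct options out of several.
print(grade_closed(selected=["b", "a"], correct=["a", "b"]))

# Half-open form with a hidden list of accepted variants.
accepted = ["Francis Galton", "F. Galton"]
print(grade_half_open("  francis   GALTON ", accepted))
print(grade_half_open("Galton", accepted))
```

The last call shows the weak point of the half-open form: an answer the authors did not foresee is rejected even if a teacher would accept it, so the stored set of variants has to be as complete as possible.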

In this regard, it can be proposed to use a single set of answer options for all control tasks on the topic. The formulations should be of a general nature and contribute to the identification of the ability to control logical thinking, which is more important and valuable than memorizing individual factual data. With sufficient skill of the teacher, with the help of the answers formulated in this way, it is possible to determine knowledge and individual facts, events.

In schools in developed countries, the introduction and improvement of tests proceeded at a rapid pace. Diagnostic tests of school performance have become widespread, using the form of an alternative choice of the correct answer from several plausible ones, writing a very short answer (filling in the gaps), adding letters, numbers, words, parts of formulas, etc. With the help of these simple tasks, it is possible to accumulate significant statistical material, subject it to mathematical processing, and obtain objective conclusions within the limits of the tasks that are presented for test verification. Tests are printed in the form of collections, attached to textbooks, distributed on computer diskettes.

Learning tests are applied at all stages of the didactic process. With their help, preliminary, current, thematic and final control of knowledge and skills, as well as the recording of academic performance and achievements, are carried out effectively.

Learning tests are increasingly penetrating into mass practice. Currently, a short-term survey of all students in each lesson using tests is used by almost all teachers. The advantage of such a check is that the whole class is busy and productive at the same time, and in a few minutes you can get a snapshot of the learning of all students. This forces them to prepare for each lesson, to work systematically, which solves the problem of efficiency and the necessary strength of knowledge. When checking, first of all, gaps in knowledge are determined, which is very important for productive self-learning. Individual and differentiated work with students to prevent academic failure is also based on ongoing testing.

Naturally, not all the necessary characteristics of assimilation can be obtained by means of testing. For example, indicators such as the ability to illustrate one's answer with examples, knowledge of facts, the ability to express one's thoughts coherently, logically and convincingly, and some other characteristics of knowledge, skills and abilities cannot be diagnosed by testing. This means that testing must be combined with other (traditional) forms and methods of assessment. Teachers who use written tests and also give students the opportunity to justify their answers orally act correctly. Within the framework of classical test theory, the level of knowledge of the subjects is assessed using their individual scores, converted into certain derived indicators. This makes it possible to determine the relative position of each subject in the normative sample.
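One derived indicator of this kind is the percentile rank, which expresses a subject's position as the share of the normative sample that scored lower. The sketch below uses the common mid-rank convention (half of the tied scores count as lower); both the convention and the sample are assumptions made for illustration.

```python
def percentile_rank(score, norm_sample):
    """Percentage of the normative sample scoring below `score`,
    counting half of the scores equal to it."""
    below = sum(1 for s in norm_sample if s < score)
    equal = sum(1 for s in norm_sample if s == score)
    return 100 * (below + 0.5 * equal) / len(norm_sample)


# Hypothetical normative sample of raw scores.
norm_sample = [10, 12, 12, 15, 18, 20, 20, 20, 25, 30]

print(percentile_rank(20, norm_sample))
```

A subject with a raw score of 20 thus stands above roughly two thirds of this sample, a statement that the raw score alone cannot convey.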

Another approach to creating tests and interpreting their results is presented in the so-called modern theory of pedagogical measurement, Item Response Theory (IRT), which was widely developed in the 1960s-1980s in a number of Western countries. Recent studies in this direction include the works of V.S. Avanesov, V.P. Bespalko, L.V. Makarova, V.I. Mikheev, B.U. Rodionov, A.O. Tatur, V.S. Cherepanov, D.V. Lyusin, M.B. Chelyshkova, T.N. Rodygina, E.N. Lebedeva and others.

The most significant advantages of IRT include measuring the parameters of the test subjects and of the test items on the same scale, which makes it possible to relate the level of knowledge of any test subject to the difficulty of each test item. Critics of tests were intuitively aware that the knowledge of subjects with different levels of preparation cannot be measured accurately with the same test. This is one of the reasons why, in practice, tests were usually designed to measure the knowledge of the most numerous group of subjects, those with an average level of preparedness. Naturally, with such an orientation of the test, the knowledge of strong and weak subjects was measured less accurately.
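What it means to place subjects and items on one scale can be shown with the one-parameter logistic (Rasch) model, the simplest member of the IRT family. The text does not say which IRT model is meant, so the choice of the Rasch model and the numerical values below are assumptions made for illustration.

```python
import math


def rasch_probability(theta, b):
    """Probability that a subject with ability `theta` solves an item of
    difficulty `b`; both are expressed on the same (logit) scale."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Hypothetical subjects and one item of average difficulty (b = 0).
for theta in (-2.0, 0.0, 2.0):
    p = rasch_probability(theta, b=0.0)
    print(f"ability {theta:+.1f}: P(correct) = {p:.2f}")
```

Only the difference between ability and difficulty matters. When the two coincide, the probability of success is one half, which matches the idea of optimal difficulty; a test aimed at average subjects offers strong and weak subjects items far from their own level, where the answers carry little information about them.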

In foreign countries, so-called success tests, which include several dozen tasks, are used in the practice of control. Naturally, this makes it possible to cover all the main sections of the course more fully. Two types of tasks are used:

a) requiring students to independently compose an answer (tasks with a constructive type of answer);

b) tasks with a selective type of answer. In the latter case, the student chooses from among the presented answers the one he considers correct.

It is important to note that these types of tasks are subject to significant criticism. It is noted that tasks with a constructive type of answer lead to biased assessments: different examiners, and often even the same examiner, give different marks for the same answer. In addition, the more freedom students have in answering, the more room teachers have for varying their evaluation.

When creating knowledge control tests, you can be guided by other classifications of test types. Usually they are divided into:

    "achievement" tests;

    standard "achievement" tests;

    intelligence tests;

    propensity tests;

    predictive tests;

    criterion-oriented tests;

    aptitude tests.

Existing other classifications are practically reduced to the types mentioned above. Further, when making a choice, it is necessary to focus on pedagogical provisions, according to which the control system should be acceptable for testing professional knowledge. The filling of its content should serve to determine both the level of intelligence and abilities controlled in a particular field of knowledge. The form of the verification procedure should include individual and/or group monitoring.

As studies have shown, it is most expedient to use criterion-oriented tests to test students.

A criterion-oriented test allows more complete individual and collective program control of the amount of acquired knowledge; yields scores that make it possible to compare the level of knowledge of students both within a single group and between groups; and shows the results achieved by each individual student over a wide range of values (points).

This type of test is also attractive because it makes it possible to identify the level of students' knowledge against a predetermined volume and content of educational material that is common to all. Two components of this type of test are clearly visible: on the one hand, data on the individual knowledge of each student can be obtained; on the other hand, the data can be compared across a wide range of study groups, provided that an adequate testing environment is created.

Ultimately, it is important to determine what each individual student knows and can do, not where he stands relative to other students. Well-formed test content ensures that each student receives a rating (an individual integral indicator) and at the same time provides the teacher with data characterizing the ability of any student to study in comparison with his fellow students. Thus, both problems can be solved simultaneously. The student's performance on the test is not assessed against a certain norm; it is determined by the degree of mastery of the discipline covered by the test and by the achievement of a certain level of performance on the proposed tasks. With the help of these tests it is therefore possible to identify how well each individual student knows both individual tasks and sections of the curriculum, and the level (height) of his mastery of a particular discipline.

2.2 Requirements for computer-aided testing systems

Recently, a fairly large number of software tools for developing and administering tests have been offered to teachers. However, many of them cannot satisfy modern requirements for the quality of pedagogical control materials (PCM), because they do not themselves meet the requirements for computer testing systems:

    the possibility of using four forms of tasks of the classical pedagogical test;

    obtaining and accumulating a matrix of response profiles of subjects for dichotomous and polytomous assessment of task results;

    adjustment and re-arrangement of test tasks depending on the results of statistical processing of test results;

    protection of the resulting matrices from unauthorized access.

In addition, from the point of view of a subject teacher, it is desirable for computer testing systems to have the following features:

    the use of multimedia technologies in testing. In most test shells, tasks are presented as text (sometimes with graphics). Multimedia testing systems combine text, graphics, animation and video in the most effective combinations and use all communication channels at once: text, image and sound. Voicing the questions and answer options helps exclude mistakes caused by the subject misreading the task, and in disciplines related to the study of foreign languages, presenting material in audio form is mandatory. Graphics (a drawing, diagram or photograph) can be included in the wording of both questions and answer options. In this case, a graphical answer can be given by selecting a certain area on the screen (for example, a region or a point on the graph of a function). The truth or falsity of the chosen answer can also be shown graphically. Animated graphics and video clips make tasks on determining a sequence of actions more visual and can demonstrate how the situation develops depending on the answer chosen by the subject, etc.;

    the use of pseudo-test tasks, for example, chain, text, situational and even non-test, for example, crossword puzzles, rebuses, etc.;

    the use of a prepared test not only for control, but also for self-control of knowledge. In this case, after completing such a test, the student receives information about the success of his actions, and after the end of self-control, he can return to the tasks to which he gave incorrect answers and try to answer again. Thus, the training element will be implemented;

    the use of adaptive testing algorithms that determine the choice of the next task depending on the testee's answers to previous questions;

    the use of hypertext links in self-control and training modes;

    testing in the network version.

The additional features listed above would expand the scope of computer testing systems.
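Adaptive selection of the next task, mentioned in the list above, can be sketched as follows. The sketch uses a simple "staircase" rule (raise the ability estimate after a correct answer, lower it after an error, then pick the task whose difficulty is closest to the estimate). This rule, the function names and the item bank are assumptions chosen for brevity, not an algorithm described in the source.

```python
def next_item(items, answered, ability):
    """Pick the unanswered item whose difficulty is closest to the ability estimate."""
    candidates = [item for item in items if item["id"] not in answered]
    if not candidates:
        return None  # the bank is exhausted
    return min(candidates, key=lambda item: abs(item["difficulty"] - ability))

def update_ability(ability, correct, step=0.5):
    """Staircase update: move up after a correct answer, down after an error."""
    return ability + step if correct else ability - step

# Hypothetical item bank with difficulties on an arbitrary scale
bank = [
    {"id": "q1", "difficulty": -1.0},
    {"id": "q2", "difficulty": 0.0},
    {"id": "q3", "difficulty": 1.0},
]

ability, answered = 0.0, set()
first = next_item(bank, answered, ability)        # the item of average difficulty
answered.add(first["id"])
ability = update_ability(ability, correct=True)   # correct answer raises the estimate
second = next_item(bank, answered, ability)       # a harder item is chosen next
print(first["id"], second["id"])
```

Production adaptive tests usually estimate ability with an IRT model rather than a fixed step; the fixed step here only keeps the example short.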

One of the factors that determine the success of creating tests is the right choice of hardware and software.

The term "technical teaching aids" appeared in the second half of the 1960s; it was understood to mean systems, complexes, devices and equipment used for the presentation and processing of information in the learning process in order to increase its effectiveness. According to their functional purpose, they are usually divided into three main classes: information, control and training. Controlling technical teaching aids are designed to determine the degree and quality of assimilation of educational material. The concept of informatization of education in our country defines the computer as the main material basis of modern education and its main technical means.

The parameters of the computers used largely determine the possibilities for effective control of students' knowledge. The best characteristics in operation in the country's universities were shown by universal PCs or computers that can combine the capabilities of almost all types of technical teaching aids. The important advantages of PCs include their ability to create conditions for the student to make independent decisions, i.e. to individualize the learning process by creating adaptive computer programs. They allow you to successfully automate the educational process, including the knowledge control procedure. According to statistics, from 80 to 90% of computers operating in various countries of the world are IBM-compatible.

An IBM-compatible computer is the most suitable technical tool for improving the quality of education and monitoring students' knowledge in modern conditions. It works with the help of system, instrumental and application programs. The second type is of greatest interest in the context of this work. Instrumental programs written in high-level languages allow programmers to create special-purpose programs: user programs and application programs. Application programs also include programs for controlling students' knowledge. The main principle behind such programs is that they are focused on a specific course of study and allow qualified users (programmers and teachers) to create original programs for training and for monitoring students' knowledge.

The limited scope of this work does not allow a more detailed and deeper consideration of many important problems associated with testing. However, it is necessary to give a brief description of the properties that a testing system should have.

Adaptability is the ability of the system to adapt to changing conditions (hardware and software).

Openness is the ability of the system, under the influence of a qualified user, to be adapted to the control of specific academic disciplines.

Standardization of the system is expressed in the use of functions, design, etc. that are found in widely used programs. A trained user feels more comfortable, and an untrained user can draw on experience gained when working with other programs.

Uniformity means creating a system on the basis of which similar systems can be built. A big mistake of the developers of computer knowledge testing systems is the development of highly specific programs for a particular academic subject. Obviously, such activity is inefficient and leads to unjustified labor costs for both the programmer and the expert.

The need for unification of control programs follows logically from the formalization of the subject area. Since it is advisable to use computer testing in the form of programmed control only in easily formalized subject areas, it makes sense to develop universal methods for presenting control questions and a unified system for their assessment, and to create the content itself in the form of separate, pluggable databases.

The possibility of expanding and building up the system is also an important characteristic. It gives the user confidence in the further, continuous use of the system, in its modification, and in the application of various solutions for its improvement.

An equally important property is the ability of the system to exercise individual and group control of students' knowledge. In addition to the obvious advantages, it makes it possible to use the system in various conditions, which are determined by the teacher, the author of the test, based on the learning objectives.

If all the above properties are taken into account, then as a result a system will be formed, working with which students will have the opportunity to test their knowledge on each topic of the academic discipline at a convenient individual pace in self-control mode, identify gaps and then eliminate them. At the same time, students will increase their motivation for learning and, to a large extent, stressful situations will be removed, a deep study of the educational material will be ensured, confidence will appear in the knowledge they have and the adequacy of the assessment they receive based on the results of control.

In addition to these properties, the knowledge control system must also meet the following criteria:

    work in a computer network (local and global), the possibility of testing simultaneously with a group of respondents;

    the system should provide for the possibility of creating new tests and analyzing test results;

    the system should contain algorithms for analyzing test results (validity of tests, assessment of the degree of their complexity, comparison of test results of different groups, etc.);

    the system should provide high flexibility in choosing the types of questions and tasks, but at the same time it should have a high degree of security;

    the instrumental system must ensure the differentiation of access rights to all its elements.

An important role in testing knowledge is played by objectivity, accuracy of results, a minimal probability of estimation error, the exclusion of subjective factors, and almost identical testing conditions for all students, which in our case are achieved with the help of computers and special programs. The depth and completeness of control are also ensured by asking the student to answer several hundred questions. This is at least an order of magnitude more than in traditional knowledge testing. At the same time, both a differentiated and an integrated assessment of the level of mastery of educational material in a particular discipline is obtained. Control is carried out immediately after the study of each section of the curriculum is completed. The teacher receives prompt and objective information about how well students have mastered this section. The data obtained can therefore be used to make appropriate adjustments to the content and methodology of the educational process.

2.3 Formation of test tasks for computer control of knowledge

In the humanities disciplines of a university, computer testing can almost completely cover control works and control over students' independent work (entry, current, thematic), and can partially cover colloquiums, tests and exams (milestone, final and concluding control).

A pedagogical test is a system of faceted tasks of a certain content, increasing difficulty and specific form, which makes it possible to qualitatively assess the structure and effectively measure the level of knowledge, skills and ideas.

It is very difficult to implement the facet property in humanitarian knowledge due to weak formalization and non-articularity.

On the one hand, test tasks (TK) make up a very high percentage, perhaps 80-90%, of computer control programs in any humanities discipline. On the other hand, not all content lends itself to being cast in the form of a test task. Many proofs and verbose descriptions are difficult, or even impossible, to express in test form.

The issue of filling databases seems obvious and therefore, as a rule, does not cause difficulties either in theory or in practice. At first glance, the development of testing questions and the definition of response standards is available to any teacher. However, in fact, the situation in this area is exactly the opposite of what it seems at first glance.

It's really easy to formulate a question. However, most developers do not ask themselves the main question: what is the purpose of this question? Which section of the topic under consideration does this question cover? Is the question formulated correctly, does it cause discrepancies, does it allow for ambiguous answers, how is it perceived by students from the point of view not of the teacher (having a large amount of knowledge compared to the students), but from the point of view of the theoretical course passed by the students?

An attempt to answer these questions shows that, first of all, the database should be developed not by an enthusiastic teacher, but by a high-level specialist in this subject area. In addition, no matter how high the level of the developer, any person is capable of making mistakes or incorrectly formulating certain provisions. Therefore, the test base, before its commissioning, must necessarily pass through an assessment of at least a methodological council in this specialty.

However, no commission is able to determine how students will perceive the control questions. Only real testing can show this. Moreover, such an assessment is technically very simple: only a cumulative statistical analysis of the answers to each specific question is needed. To do this, each question must be uniquely identifiable. The analysis of such statistics, especially when control testing is conducted in different training groups, gives an ambiguous result. A question to which no one can give the correct answer is either incorrectly formulated, or its topic is extremely poorly covered in the learning process. A question to which everyone answers correctly is either poorly formulated (the text of the question contains hints), or its topic is very well covered in the learning process and correctly assimilated by the whole group.
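The cumulative per-question statistics described above can be sketched in a few lines. The data layout (a dictionary of student answers keyed by question identifier) and the function names are assumptions made for illustration.

```python
def item_facility(responses):
    """Return the share of correct answers for each question id.

    `responses` maps a student id to a dict {question_id: True/False}.
    """
    totals, correct = {}, {}
    for answers in responses.values():
        for qid, is_correct in answers.items():
            totals[qid] = totals.get(qid, 0) + 1
            correct[qid] = correct.get(qid, 0) + int(is_correct)
    return {qid: correct[qid] / totals[qid] for qid in totals}

def flag_items(facility):
    """Flag questions that nobody or everybody answered correctly."""
    return {
        qid: ("nobody correct" if share == 0.0 else "everybody correct")
        for qid, share in facility.items()
        if share in (0.0, 1.0)
    }

# Hypothetical results of one group
responses = {
    "s1": {"q1": True, "q2": False, "q3": True},
    "s2": {"q1": True, "q2": False, "q3": False},
    "s3": {"q1": True, "q2": False, "q3": True},
}
facility = item_facility(responses)
print(flag_items(facility))
```

The flags only mark questions for manual review. As the text notes, the statistics alone cannot tell whether the cause lies in the wording of the question or in how the topic was taught.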

Such ambiguity in the analysis of statistics leads to the question of the time and form of statistical analysis and testing in general.

The main forms of test tasks are: tasks of an open form, tasks of a closed form, compliance (matching) tasks, and tasks on establishing the correct sequence.

1. Tasks with the choice of one or more correct answers. These tasks include the following types:

1.1. Choice of one correct answer according to the principle: one is correct, all the others (one, two, three, etc.) are incorrect.

For example: a lack of which vitamin disrupts the growth and development of bones?

a) vitamin A;

b) vitamin B;

c) vitamin C;

d) vitamin D.

1.2. Select multiple correct answers.

1.3. Choice of one, the most correct answer.

For example, organic substances include:

a) proteins;

b) proteins and carbohydrates;

c) proteins, carbohydrates and fats;

d) proteins, carbohydrates, fats and mineral salts.

Each of the answers is generally plausible, but the 1st and 2nd answers are not complete. The 4th answer is also not correct, since mineral salts do not belong to organic substances.

2. Open form tasks. The tasks are formulated in such a way that there is no ready answer; you need to formulate the answer yourself and enter it in the space provided.

3. Compliance tasks, where the elements of one set must be matched with the elements of another set.

For example, match:

Habitat:

1) Organism

2) Water

3) Soil

4) Ground-Air

5) Land-water

Organisms:

a) carp

b) jellyfish

c) mole

d) earthworm

e) sparrow

f) tiger

g) roundworm

h) frog

i) dysenteric amoeba

4. Tasks to establish the correct sequence (calculations, actions, steps, operations, terms in definitions).

The listed forms of computer representation of test tasks do not exhaust their diversity. Much depends on the skill and ingenuity of the teacher. When creating tests, it is important to take into account many circumstances, for example, the personality of the person being tested, the type of control, the methodology for using tests in the educational process, etc.

The choice of form depends on:

    testing goals;

    test content;

    technical capabilities;

    the level of preparedness of the teacher in the field of theory and methodology of test control of knowledge.

The best test is one that has broad content and covers deeper levels of knowledge.

The test task includes:

a) the statement part, which describes the situation (it may be absent) and does not require any active actions from the person being tested;

b) the procedural part, which contains suggestions for the student to perform specific actions: choose the correct element from the proposed set, establish a correspondence or the correct sequence, name a date, write down a name, etc. After receiving this information, the student is required to take active actions related not only to studying and analyzing the material contained in the task, but also to composing and entering an answer;

c) the elements of choice.

General rules for all forms of test tasks. It is necessary to monitor the correctness of the wording of the task. The test task should be formulated clearly and specifically, without ambiguity in the answer. The optimal number of response items is 5-8, but there are exceptions.

The procedural part of the test task should be as short as possible and should not exceed 5-10 words. The test task must be formulated in the affirmative form. It is not allowed to define a concept by enumerating elements that are not included in it.

For all forms of the test task, there should be a standard instruction. All elements in the tasks should be selected according to some specific principle chosen by the author. A large number of test tasks that are simple in structure is preferable to a small number of complex ones.

In complex separative test tasks, all possible alternatives must be listed; otherwise the student's idea of the classification or structure of the underlying object is distorted.

Test tasks of an open form must meet the following requirements:

    the word or phrase to be supplied is placed at the end and must be the only one;

    only what is important should be left for the student to supply;

    it is desirable that the task be formulated so that the supplied word is in the nominative case;

    all dashes for the addition must be the same length;

    it is desirable to give the trainee a sample answer.

Test tasks of a closed form must meet the following requirements:

    equal plausibility of the selection elements;

    it is desirable that all selection elements are equal in length;

    it is desirable to use one object or an equal number of objects in selection elements;

    it is necessary to exclude repeated words in the answers;

    all items must be true statements, but only one of them is the correct answer for this item, and the rest may be true for other items in this test or in other tests.

Compliance tasks contain two sets: the right column is for the choice, and the left column is for the answer. The right column contains, for example, 1-3 more elements, so that at the last substitution the student still has a choice rather than an automatically substituted remainder. All elements are true statements.

In test tasks on establishing the correct sequence, the elements can be listed alphabetically. If the alphabetical list is itself the correct answer, the elements are placed randomly.

To reduce copying of answers from a neighbor, every test task should be formulated in 2-3 variants that are synonymous in meaning and are selected randomly. In tasks of a closed form and in compliance tasks, the elements are presented in an order produced by a random number generator. The elements of the task in these forms are formed according to the principle of "main" and "reserve" players. For example, to show a student 5 elements, the author does not form a set of "1 correct + 4 incorrect" but "1 correct + 4 main incorrect + 5 spare incorrect", and the 4 incorrect elements actually shown are randomly selected from these 9.
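The "main and reserve players" principle can be sketched with the standard `random` module. The function name `build_options` and the placeholder answer texts are assumptions for illustration.

```python
import random

def build_options(correct, distractor_pool, n_shown=4, rng=random):
    """Return a shuffled option list: the correct answer plus n_shown random distractors."""
    distractors = rng.sample(distractor_pool, n_shown)  # drawn without repetition
    options = [correct] + distractors
    rng.shuffle(options)  # the position of the correct answer changes every time
    return options

# 4 "main" + 5 "spare" incorrect answers = a pool of 9 (placeholder texts)
pool = [f"wrong answer {i}" for i in range(1, 10)]

# A seeded generator is used here only to make the example repeatable
options = build_options("right answer", pool, rng=random.Random(42))
print(options)
```

Because both the set of distractors and their order vary between students, memorizing the position of the correct answer no longer helps.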

Methods for assessing test quality criteria. The classical theory of tests is based on the theory of correlation, the main parameters of which are reliability and validity. Reliability is the stability of the test results obtained by its application. Validity is the suitability of the test, i.e. its ability to qualitatively measure what it was created to measure according to the intention of the authors.

There is a strict scientific theory of tests, which allows methodologically and methodically to justify their use and processing of test results. An evidence-based test is a method that meets established standards of reliability and validity (a value between 0 and 1; the closer to 1, the better the test).
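The source does not say which reliability coefficient it has in mind. One widely used estimate in classical test theory is Cronbach's alpha, sketched below as an assumption; the function name and the score matrix are illustrative only.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix of item scores (rows = subjects, columns = items)."""
    k = len(scores[0])                                    # number of items
    item_vars = [pvariance(col) for col in zip(*scores)]  # variance of each item
    total_var = pvariance([sum(row) for row in scores])   # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical dichotomous results: 4 subjects, 3 items
matrix = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(cronbach_alpha(matrix))  # 0.75
```

The function assumes at least two items and a non-zero variance of total scores; a real analysis would check these conditions first.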

According to the classification, there are tests focused on a norm (ranking by strong - weak students) and tests focused on a criterion (sorted by difficult - easy tasks).

By the nature of the actions, tests are divided into verbal (expressed in words) and non-verbal (represented by images).

According to the degree of homogeneity of tasks, tests are homogeneous (in one discipline) and heterogeneous (for several disciplines).

By purpose of use: the beginning of training, progress and difficulties in the learning process, achievements at the end of training. The practice of higher education shows that criterion-oriented, overwhelmingly verbal, homogeneous tests, aimed as a rule at the end of training, are the most applicable.

Assessment of test items can be polytomous (if one of the 10 elements of the task was done incorrectly, the total score is 9) or dichotomous (all elements done - 1 point, otherwise - 0 points).

According to the degree of difficulty, tasks can be single-level, i.e. with a weighting factor equal to one, and multilevel, with a weighting factor from 0 to N.
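The difference between dichotomous and polytomous assessment, and the role of weighting factors, can be shown in a few lines. The function names are assumptions for illustration.

```python
def dichotomous_score(n_correct_elements, n_elements):
    """1 point only if every element of the task is done correctly, otherwise 0."""
    return 1 if n_correct_elements == n_elements else 0

def polytomous_score(n_correct_elements):
    """One point for every correctly done element of the task."""
    return n_correct_elements

def weighted_total(task_scores, weights):
    """Sum of task scores multiplied by their weighting factors (multilevel tasks)."""
    return sum(score * weight for score, weight in zip(task_scores, weights))

# The example from the text: 9 of the 10 elements are done correctly
print(polytomous_score(9))        # 9
print(dichotomous_score(9, 10))   # 0

# Three dichotomously scored tasks with hypothetical weights 1, 2 and 3
print(weighted_total([1, 0, 1], [1, 2, 3]))  # 4
```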

The length of a test is the number of tasks included in the test. Classical test theory says that the longer the test, the more reliable it is. But practice shows that if the test is very long, then motivation and attention deteriorate. In practice, the length of the test should be determined empirically, taking into account validity, testing time, etc. The optimal length of the test, as shown by theory and practice, is 30-60 tasks. The ratio of test length to the number of test tasks in the bank should tend to a ratio of 1:10.
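The claim that a longer test is more reliable is usually formalized in classical test theory by the Spearman-Brown formula; the source does not name it, so its use here is an assumption. The second helper simply applies the 1:10 ratio from the text. Both function names are illustrative.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when the test is made `length_factor` times longer."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

def recommended_bank_size(test_length, ratio=10):
    """Size of the task bank implied by a 1:ratio test-to-bank proportion."""
    return test_length * ratio

# Doubling a test whose reliability is 0.6
print(spearman_brown(0.6, 2))

# A 60-task test under the 1:10 recommendation
print(recommended_bank_size(60))  # 600
```

The formula assumes that the added tasks are of the same quality as the existing ones; it does not account for the fatigue and loss of motivation that the text mentions for very long tests.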

Each test has an optimal testing time: the time from the start of the testing procedure to the onset of fatigue. The spread of the fatigue threshold is quite large, from 20 to 100 minutes within one age group. The main factors behind fatigue are age, motivation, monotony of the work performed, and individual characteristics of the subjects. Therefore, it is necessary to maintain motivation at the right level, to diversify the work as much as possible by using all forms of tasks and non-verbal support, and to adapt the software product to the individual characteristics of the subjects. The average approximate time to the onset of fatigue for students is 50-80 minutes (maximum duration). The minimum depends on the forms, number and difficulty of the tasks and on the number of elements in a task. For example, for an easy closed-form test task with the choice of one element from those proposed, 10-15 seconds are enough. The actual times should be refined during approbation.

The ratio of task forms in the test. The choice of the form of a test task depends on the content of the course, the purpose of the test and the skill of the developer. An average layout can be as follows: in a test of, for example, 60 tasks, no more than 10 open-form tasks are recommended, approximately 10 each for matching and for sequence, and it is more expedient to give the remaining 30 tasks in closed form.

2.4 Types of computer control questions

Probably the biggest misconception of the developers of most control programs is the exclusive use of so-called single choice: the student is asked a question and given several ready-made answers (as a rule, five, which makes it more convenient to derive a grade), one of which is correct. It is true that there is a class of control questions that can be implemented in this way, and that the probability of guessing (20%) is quite low. Nevertheless, fixating exclusively on single choice rules out the rich possibilities for using pedagogical technologies in control.

In addition, it is no secret to anyone how students bypass this type of control - sooner or later a printout with the correct answers falls into the hands of students, and the order of answers is simply memorized or entered on a cheat sheet. Not all (in fact, only a few) knowledge control systems implement the function of changing the location of the correct answer with each test.

What types of questions can be used in the computer version of programmed control?

Arbitrary type, or keyboard input. This is the most powerful tool for checking all sorts of terms, constants and dates. However, its implementation is, as a rule, algorithmically very complex and is therefore ignored by most developers. The problem lies, first of all, in the fact that the entered phrase must be subjected to a syntactic and, ideally, a semantic analysis that models the possible lines of the respondent's thinking. In addition, a student can make a typo, and in most areas of knowledge such typos cannot be considered an error; this requires a very flexible implementation of computer logic, which not every programmer can provide. Students may also use various synonyms when entering an arbitrary answer, which the database developer may not have foreseen and which may be completely or partially correct. Finally, an arbitrary question may have several possible answers.
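A full syntactic or semantic analysis is beyond a short example, but tolerance to typos and a list of accepted synonyms can be sketched with plain string similarity from the standard `difflib` module. This is a much simpler technique than the analysis described above; the function names, the threshold of 0.85 and the sample answers are assumptions.

```python
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())

def is_accepted(answer, accepted_answers, threshold=0.85):
    """Accept the answer if it is close enough to any accepted variant (synonym)."""
    answer = normalize(answer)
    return any(
        SequenceMatcher(None, answer, normalize(variant)).ratio() >= threshold
        for variant in accepted_answers
    )

# Hypothetical list of accepted answers for one question
accepted = ["photosynthesis", "carbon assimilation"]

print(is_accepted("  Photosynthesis ", accepted))  # exact after normalization
print(is_accepted("photosynthesys", accepted))     # a one-letter typo
print(is_accepted("respiration", accepted))        # a different term
```

A similarity threshold cannot tell apart short answers that differ in a single meaningful character (for instance "vitamin A" and "vitamin D"), so questions of that kind need exact comparison after normalization.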

There are also a number of variations of the arbitrary question type:

Entering multiple answers in a specific sequence can be used in questions about a strict sequence of operations, relative positions, etc. This type is as complex to program as the arbitrary type, is very difficult to design, and causes certain difficulties for students, since it requires not only error-free input of the answers but also their correct relative order. However, despite its rather rare use, this type is indispensable and is a powerful tool for determining the student's level of knowledge in matters such as the relative position of organs in topographic anatomy, the sequence of transformations of a substance in chemistry, the sequence of actions in various kinds of repair work, etc.;

Entering missing parts of lines or letters is, despite its apparent simplicity, an indispensable tool for testing understanding of various language constructs (in Russian and foreign languages, in programming, etc.). Unlike the standard arbitrary type of question, it assumes, as a rule, unambiguous answers and is therefore easier to program;

Selective question type. The classic version, which the vast majority of developers consider necessary and sufficient for computer testing. This type of question may imply one or more correct answers from those offered. Some theorists divide these two varieties into different types of questions, but from the point of view of formal logic, these varieties are absolutely equivalent. The question is only in the methodology for deriving results for these varieties.

The computer implementation of this type is unusually simple. Perhaps, this is the reason for its widespread use in various kinds of testing programs. To implement this type, even basic knowledge in any programming language or in programmable office systems such as Excel or Quattro is sufficient.

The selective question type also has varieties:

Alternative type is the most simplified form and assumes a ready-made answer already in the text of the question. The subject only has to indicate whether this answer is correct or not (i.e. answer "Yes" or "No"). Despite the apparent simplicity, this type can be successfully used in some areas of knowledge.

A variation on the selective type is the question type called "Selection". However, the difference between it and the standard selective type is only in the output system.

Sequential question type. This type, the most difficult for students although quite simple to implement, gives the teacher a powerful tool for assessing not only specific knowledge but also logic.

In a simplified version of the sequential type, "Rearrangement", the student is asked a question and given a set of ready-made correct answers. His task is to arrange these answers in the required sequence.

Like the "Sequence" type, this variety can be used in those subject areas where a clear knowledge of the sequence of operations, actions or the correct relative position of objects is required. However, unlike the "Sequence" type, this variety can be used much more widely, since it does not contain the "pitfalls" of incorrect formulation of any term by students - all answers are already on the screen.
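Checking an answer to a question that asks the student to arrange ready-made answers in order can be sketched as follows. The strict check mirrors the text; the partial-credit function is an added assumption, as are the function names and the sample repair sequence.

```python
def is_correct_order(answer, key):
    """Strict check: the student's sequence must equal the key exactly."""
    return list(answer) == list(key)

def positions_matched(answer, key):
    """Share of elements standing in their correct position (partial credit)."""
    hits = sum(given == expected for given, expected in zip(answer, key))
    return hits / len(key)

# Hypothetical sequence of actions in a repair task
key = ["diagnose", "disassemble", "replace part", "assemble", "test"]
answer = ["diagnose", "replace part", "disassemble", "assemble", "test"]

print(is_correct_order(answer, key))   # False: two steps are swapped
print(positions_matched(answer, key))  # 0.6: three of five steps are in place
```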

A more complicated version of the sequential type, "Arrangement", is the most complex of all types, both in terms of programming and in terms of its perception by students. However, it is this type that provides the widest opportunities for testing logic. Formally, answering a question of this type consists in the student constructing the graph of a logical structure. The text of the question lists certain numbered provisions (paragraphs), and the text of the answers contains the conclusions or facts corresponding to these paragraphs. The student is required to match the items listed in the question with the ready-made answers.

Chapter 3. Methodology for conducting a computer survey of students

3.1 Methodology for conducting a programmed survey

The problem of organizing collective forms of educational activity becomes especially relevant when classes are held in classrooms equipped with a local computer network. On the one hand, the network gives the teacher new opportunities to manage the educational process; on the other hand, it enables effective independent work by students on practical tasks.

A local computer network makes it possible to present any action in a detailed sequence of operations, show its result, conditions for execution; fix intermediate operational results, allows interpreting and evaluating each step of the trainees when performing tasks, etc.

For a teacher, a computer network allows both final and operational control, accumulation of final information related to both an individual student and the entire group as a whole. A computer network allows you to qualitatively change the system for checking the activities of students, while providing flexibility in managing the educational process. Working on one common database allows you to check the correctness of the execution of all tasks and not only fix the error, but also determine its nature, which helps to eliminate the cause that caused its occurrence in time.

Topics and possible variants of test tasks are prepared in advance. The content of test tasks is formulated so as to show how the knowledge and skills needed to master the material apply in practice.

Individualization of learning can be realized by differentiating the content of the presented educational material, as well as by selecting test tasks according to the level of complexity.

Choosing the degree of difficulty of tasks plays an important role. Excessively simple tasks require no mental effort from the trainee and therefore hinder the formation of the necessary skills. The trainee does not experience the correct completion of relatively easy tasks as success. At the same time, many mistakes activate students' creative potential and have a positive effect on their cognitive needs and motivation.

The means of creating educational and cognitive motivation can be both the content of the test task and the form of organization of activities (learning and playing, group, individual).

At the discretion of the teacher, students may be offered a plan for completing a test task, and work with workbooks and literature is allowed. The teacher can maintain the interest of the trainees by engaging in the process of discussing individual nuances when performing a test task.

The lesson can be structured so as to maximize the development of the trainees. To do this, at the moment when they feel they have completed the proposed test tasks, the teacher can stimulate further cognitive activity by posing problematic questions on the topic being studied that arouse cognitive interest. By resolving these difficulties, students acquire new knowledge and skills. Thus, a group of trainees can work on test tasks in a mode of sequential problem solving.

After the trainees complete the test tasks, the teacher can organize a group discussion of the tasks that caused the greatest difficulties. It is advisable to include in the discussion the questions that remained unconsidered and to look for possible ways to solve them. Thus, the teacher not only exercises control but also becomes the organizer of the students' independent, active acquisition of new knowledge.

3.2 Handling test results

The question of how to derive a grade is probably one of the most complex and controversial in pedagogy. Indeed, it is easy to ask a question, but determining whether the student answered correctly, how correctly he answered, and whether he reasoned correctly despite a wrong answer is a task that is far from completely solved. Accordingly, computer-based grading suffers from the same shortcomings, if not more.

In most programmed control systems, the principle of deriving the result is simple. Since such systems, as a rule, use only single-choice selection, the grade is calculated simply: a correct answer is a plus, an incorrect or missing answer is a minus. Then the number of pluses and minuses is reduced to a five-point scale and a mark is displayed.

Such a principle for deriving a grade, primitive as it is, has a right to exist when all questions in the database are equivalent and of the same type. However, the direct reduction of the number of correct and incorrect answers to a five-point scale deserves serious criticism. It is generally accepted that the credit threshold for the assimilation coefficient is 70%. Under the direct reduction, by contrast, it is enough to answer 51% of the questions correctly to get a passing mark (i.e. "satisfactory"), 71% to get a "good" mark, and 91% to get an "excellent" mark.
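The gap between the two rules is easy to see in a small sketch. The thresholds of 51%, 71%, 91% and 70% come from the text; the function names and the label "unsatisfactory" for results below 51% are assumptions made for illustration.

```python
# Thresholds quoted in the text for the direct reduction to a five-point scale.
DIRECT_THRESHOLDS = ((91, "excellent"), (71, "good"), (51, "satisfactory"))
CREDIT_THRESHOLD = 70  # generally accepted assimilation threshold, in percent


def direct_grade(correct: int, total: int) -> str:
    """Map a count of correct answers to a mark using the 51/71/91 rule."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    for minimum_percent, mark in DIRECT_THRESHOLDS:
        # Integer comparison avoids floating-point rounding at the boundaries.
        if 100 * correct >= minimum_percent * total:
            return mark
    return "unsatisfactory"  # assumed label for results below 51%


def meets_credit_threshold(correct: int, total: int) -> bool:
    """Check the share of correct answers against the 70% credit threshold."""
    return 100 * correct >= CREDIT_THRESHOLD * total


# 11 of 20 correct is 55%: a pass under direct reduction, yet below 70%.
print(direct_grade(11, 20), meets_credit_threshold(11, 20))
```

A student with 11 correct answers out of 20 therefore receives "satisfactory" under the direct reduction although the assimilation coefficient is well below the credit threshold.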

In practice, however, this approach is rare, since developers of testing systems are aware that the questions in a database are not equivalent. Another method lets the teacher assign each question in the database a "weight", i.e. its relative importance.

This technique, despite its apparent effectiveness, also has its drawbacks. From the point of view of pedagogical theory, there are no simple and complex questions (if we are talking specifically about questions, and not about mathematical and logical problems that require a multicomponent solution). A question is always simple for someone who knows the answer and difficult for someone who does not. Thus, in assigning "weights" to questions, the teacher actually ranks them according to his own ideas about their complexity, that is, according to his own level of competence or incompetence.
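The weighting scheme discussed above can be sketched as follows. The function name `weighted_percent` and the sample weights are hypothetical; the text only states that the teacher assigns each question a relative importance.

```python
def weighted_percent(results: list[tuple[float, bool]]) -> float:
    """Return the share of earned weight, in percent.

    Each entry is (weight assigned by the teacher, answered correctly?).
    """
    total_weight = sum(weight for weight, _ in results)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    earned = sum(weight for weight, is_correct in results if is_correct)
    return 100 * earned / total_weight


# Two questions of weight 1 answered correctly, one of weight 3 missed.
answers = [(1, True), (1, True), (3, False)]
print(weighted_percent(answers))  # share of weight earned
print(100 * sum(ok for _, ok in answers) / len(answers))  # unweighted share
```

With these sample weights the same three answers give 40% by weight but about 67% by simple counting, which shows how strongly the result depends on the teacher's subjective choice of weights.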

However, there are questions that require more or less response time. It is logical to assume that, from the point of view of psychophysiology, answering each question takes a greater or lesser number of mental (so-called essential) operations. Determining this number is, as a rule, not difficult: in simple types of questions it equals the number of proposed answer options, and the count can be fully automated.

Thus, at the moment there are two ways to determine the result of an answer: by the correctness of the answer to the question as a whole, and by essential operations. When choosing the evaluation principle, it should be assumed that evaluation by essential operations is more flexible and objective, since it makes it possible to identify incomplete, not entirely correct, partially erroneous and other similar answers and to express them as specific values of the assimilation coefficient.

The flexibility of evaluation by essential operations lies in the possibility of introducing so-called "soft scoring". Scoring by the answer as a whole always uses "hard scoring", i.e. if the student makes a mistake, the whole question is not counted. However, this method is not justified for all questions. For example, in many questions that have several correct answers (the selective question type is implied), it is not necessary to mark every correct answer. In such questions, either a partially correct answer or, conversely, the mere absence of a wrong answer is quite acceptable. Evaluation by essential operations makes it possible to determine a coefficient of correctness for such answers and to count partially correct ones.
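The difference between hard and soft scoring can be shown with a minimal sketch. It assumes, following the remark above about simple question types, that each answer option is one essential operation (a decision to mark it or leave it unmarked). The function names and the exact coefficient formula are assumptions for illustration.

```python
def hard_score(correct: set[str], chosen: set[str]) -> int:
    """Hard scoring: the question counts only if the answer is fully right."""
    return int(chosen == correct)


def soft_score(options: list[str], correct: set[str], chosen: set[str]) -> float:
    """Soft scoring: share of options the student handled correctly.

    An option is handled correctly when it is marked and belongs to the
    key, or left unmarked and does not belong to the key.
    """
    if not options:
        raise ValueError("a question needs at least one option")
    right_decisions = sum(
        1 for option in options if (option in correct) == (option in chosen)
    )
    return right_decisions / len(options)


options = ["A", "B", "C", "D"]
key = {"A", "C"}
answer = {"A"}  # one correct option marked, one missed, no wrong ones
print(hard_score(key, answer), soft_score(options, key, answer))
```

Here the incomplete answer earns nothing under hard scoring but a coefficient of 0.75 under soft scoring, since three of the four decisions were right.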

Conclusion

One of the significant trends in the development of education is the search for innovative methods of knowledge control that meet the requirements of objectivity, reliability, and ease of implementation. At the present stage, computer control of knowledge plays an important role among the effective methods for assessing the abilities and achievements of students, and it is now successfully used in educational institutions of various levels, from schools to universities.

Compared with traditional forms of control, computer testing has a number of advantages: quick receipt of test results, freeing the teacher from the laborious work of processing test results, unambiguous fixation of answers, confidentiality in anonymous testing.

After analyzing the modern literature on this issue, the following requirements for a unified automated testing system were identified:

    protection against unauthorized access to test questions. The solution to this problem can be carried out by means of data encryption;

    an unlimited question base, which provides both variety of tests and less repetition of questions;

    simplicity of the program interface. Many specialists, especially those whose specialization is not related to information technology, are not confident with computers and computer programs, so a clear and accessible interface is an important requirement for a testing system;

    ease of test administration. This requirement is also important. The simpler the environment for developing topics and tests, the fewer questions will arise about working on the computer. Ease of administration is achieved by using a separate program for creating or adding topics and tests to the database and for setting parameters;

    full automation of the testing process. Testing should be carried out without supervision by the teaching staff. Therefore, the whole process, from the teacher setting the test questions and identifying the specialist being tested, through conducting the test, to assessing the result and writing it to a data file, must take place in a completely autonomous mode;

    loading speed. This criterion is important for computers with low performance. A person should not have to wait long for a question to load. Every picture and graph must be optimized or compressed; they should contain no redundant information and include only the necessary part;

    portability to different platforms with Microsoft Windows GUI support;

    logging of access attempts. Each test run should be recorded. This is necessary to account for failed attempts when a test was interrupted for any reason, and it provides control over user actions;

    orientation toward users who are not programmers. Using the test program should not require experience with other applications;

    support for multimedia files (graphics, video, sound, animation). This is necessary for asking complex questions, for example, ones that display graphs, drawings, videos, etc.

An analysis of the literature made it possible to identify the following types of computer knowledge control questions: arbitrary type, or keyboard input; entering several answers in a certain sequence (ranking); entering missing parts of lines or letters; selective question type; alternative question type; sequential question type. For effective control of knowledge, it is necessary to correctly use all types of questions.

Currently, there are two ways to determine the result of an answer: by the correctness of the answer to the question as a whole, and by essential operations. When choosing the evaluation principle, it should be assumed that evaluation by essential operations is more flexible and objective, since it makes it possible to identify incomplete, not entirely correct, partially erroneous and other similar answers and to express them as specific values of the assimilation coefficient.

The test system has the following important characteristics:

    adaptability, i.e. the ability of the system to adapt to changing conditions (hardware and software);

    openness is determined by the ability of the system to adapt to the control of specific academic disciplines;

    standardization of the system is expressed by the use of functions and design used in programs of common use;

    unification lies in the fact that on the basis of this system, you can create similar ones.

The knowledge control system implemented in the course of this study provides automated support for students' independent work. It allows both monitoring and self-checking of the level of assimilation of the material and acts as a simulator in preparing for exams.

The developed knowledge control system will solve the problem of automating the creation of tests and testing procedures, and can be used to control the process of mastering the material of various academic disciplines by students.

Compared to other forms of knowledge control, testing has its advantages and disadvantages.

Advantages

    Testing is a higher-quality and more objective method of assessment; its objectivity is achieved by standardizing the procedure and by checking the quality indicators of individual tasks and of tests as a whole.

    Testing is a fairer method: it puts all students on an equal footing, both in the control process and in the evaluation process, practically eliminating the subjectivity of the teacher. According to the English association NEAB, which deals with the final assessment of students in the UK, testing can reduce the number of appeals by more than three times and make the assessment procedure the same for all students, regardless of their place of residence and the type and kind of educational institution in which they study.

    Tests are a more comprehensive tool, since a test can include tasks on all topics of the course, while an oral exam usually covers 2-4 topics and a written one 3-5. This makes it possible to reveal the student's knowledge across the whole course, eliminating the element of chance in drawing an exam ticket. With the help of testing, one can establish the student's level of knowledge in the subject as a whole and in its individual sections.

    A test is a more accurate tool: for example, the assessment scale of a 20-question test has 20 divisions, while the usual knowledge assessment scale has only four.

    Testing is more efficient from an economic point of view. The main costs during testing are for the development of high-quality tools, that is, they are of a one-time nature. The cost of conducting the test is much lower than with written or oral control. Testing and monitoring the results in a group of 30 people takes one and a half to two hours, an oral or written exam - at least four hours.

    Testing is a gentler tool: it puts all students on an equal footing by using a single procedure and common assessment criteria, which reduces pre-examination nervous tension.

Disadvantages

    The development of high-quality test tools is a long, laborious and expensive process.

    The data obtained by the teacher as a result of testing, although they include information about knowledge gaps in specific sections, do not allow us to judge the reasons for these gaps.

    A test does not make it possible to check and evaluate high, productive levels of knowledge related to creativity, that is, probabilistic, abstract and methodological knowledge.

    The breadth of coverage of topics in testing has a downside. The student during testing, unlike the oral or written exam, does not have enough time for any in-depth analysis of the topic.

    Ensuring the objectivity and fairness of the test requires the adoption of special measures to ensure the confidentiality of test items. When re-applying the test, it is desirable to make changes to the tasks.

    There is an element of randomness in testing. For example, a student who did not answer a simple question may give the correct answer to a more complex one. The reason for this can be both an accidental mistake in the first question, and guessing the answer in the second. This distorts the test results and leads to the need to take into account the probabilistic component in their analysis.

It was the last of these shortcomings that prompted a small study. I wondered what average score a student could get by answering test questions at random. A group of students was offered ten tests on various topics, which differed in the number of questions, interface and implementation. Since it was the probabilistic component of the test that was being studied, the students answered without thinking about the substance of the questions. The measure recorded for each test was the percentage of correctly completed tasks. A total of 120 measurements were made. The values ranged from 5% to 64%; the average of all measurements was 28.10%.
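For comparison with such measurements, the theoretical baseline for pure guessing is simple to compute: on a single-choice question with k options, a random answer is correct with probability 1/k. The sketch below is an illustration under that assumption only; the option counts are hypothetical, since the composition of the ten tests in the study is not given.

```python
import random
from fractions import Fraction


def expected_guess_percent(option_counts: list[int]) -> float:
    """Expected percent of correct answers when every answer is a guess.

    option_counts[i] is the number of options in single-choice question i.
    """
    if not option_counts:
        raise ValueError("the test must contain at least one question")
    total = sum(Fraction(1, k) for k in option_counts)
    return float(100 * total / len(option_counts))


def simulate_guess_percent(option_counts: list[int], rng: random.Random) -> float:
    """Simulate one random run; option 0 is treated as the correct one."""
    hits = sum(1 for k in option_counts if rng.randrange(k) == 0)
    return 100 * hits / len(option_counts)


# Hypothetical test: ten questions with four options each.
test = [4] * 10
print(expected_guess_percent(test))

rng = random.Random(2002)  # fixed seed so the run is repeatable
runs = [simulate_guess_percent(test, rng) for _ in range(2000)]
print(min(runs), max(runs), sum(runs) / len(runs))
```

Individual simulated runs scatter widely around the expected value, which is in line with the wide spread of measured values reported above.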

PROS AND CONS OF TESTING AS A METHOD
Pustynnikova Yu.M.
"New Markets", No. 6, 2002

Personnel assessment has been and remains one of the most important elements of the personnel management system: it is impossible to do without assessment in the selection of personnel, in certification, in the creation of a personnel reserve, or in the rotation of personnel. Often, the effectiveness of the entire HR system depends on the effectiveness of personnel assessment, which in turn depends directly on the adequacy of the methods and approaches used. Is testing, so fashionable in the early 90s, always an equally adequate method?

At one time, when HR management in our country was taking its first steps, most HR managers were recruited from among psychologists, who brought with them to the new field the usual tools of their scientific work: tests. This is quite understandable. In those days they knew nothing else, information about Western technologies for working with personnel trickled in “a teaspoon per hour”, and domestic methods had not yet been developed.

To maintain their credibility and not lose their jobs, some psychologists gave candidates batteries of clinical tests with 300-600 questions each to fill out. Of course, such a selection made an indelible impression on candidates, on employers, and on the "HR managers" themselves. In addition, the output was “objective” data. This is apparently where the myth of the omnipotence of tests originates.

Unfortunately, this is just a myth. The use of tests for scientific purposes has a number of limitations, while the use of testing in business is doubly limited.

Traditionally, the advantages of testing include the standardization of methods, the presence of a normative result, and its reproducibility. It is believed that the data obtained during testing are objective. Also, many managers are impressed by the scientific nature of the evaluation procedure in the case of testing.

However, almost all of these advantages have a flip side. Let's start with standardization. Far from all the methods used by HR managers are truly standardized (that is, tested on a large reference sample, which confirmed that people in whom the tested trait is equally pronounced obtain the same test results); very often amateur and popular-science tests are used in personnel work. Moreover, standardization in itself is not yet a guarantee of quality: as a rule, tests are standardized on students, and no one can guarantee that the norm of, say, anxiety will be the same among students, accountants and, for example, customs brokers.

The objectivity of the data obtained through testing can also be questioned. Most of the tests used in personnel assessment are questionnaires; not all of them are equipped with a lie scale. The bulk of these questionnaires were designed for research purposes, testing was voluntary, or at the initiative of the subject, so the lie scale was not provided, or was poorly protected: the subjects had no reason to lie. Therefore, for a person with a higher education (which means a sufficiently high level of intelligence), “passing” such a test is not a problem, especially if the success of passing the test determines whether he will be accepted for a promising job.

In addition, cumbersome questionnaires require a lot of time to complete, process and interpret. Naturally, a person who spends a lot of time and effort filling out tests begins to feel annoyed with the company and the people who subjected him to such a “test”. As a result, the image of the company deteriorates, the loyalty of employees decreases.

By and large, psychological testing in personnel work makes sense in two cases: when assessing the professional suitability of specialists in areas that place special demands on a professional's cognitive functions (attention, memory, thinking, emotional sphere, etc.), such as accountants, dispatchers, pilots, etc.; and when the flow is large (mass recruitment or certification of specialists of the same type), when speed of assessment is necessary and the ability to compare results becomes very important.

At the same time, many characteristics that are in great demand in the labor market (corporate spirit, loyalty, constructiveness, customer orientation, etc.) cannot be reliably identified using tests. And it is impossible to determine whether a candidate will fit into the organizational culture of the company by any methods other than observation and conversation. In addition, it is far from always possible to establish a direct connection between the presence of certain psychological qualities in a candidate and his professional success, and the absence of a number of professionally important qualities can be compensated for by experience and an individual style of work. In general, fixation on identifying a predetermined set of characteristics limits the range of information that can be obtained during the assessment.

In general, the use of questionnaire tests requires less competence in psychology from the personnel manager than projective methods, observation and interviews, since the results of testing depend minimally on the skill of the researcher. However, a lack of proper competence can lead to an inadequate choice of method, so that something other than what was planned is measured. Often the test used is the one the researcher is good at or accustomed to, rather than the one that fits the situation. Many people have probably encountered cases where the MMPI clinical test, created to detect severe mental pathologies from the field of major psychiatry, was used to select and evaluate managers, sales representatives, insurance agents and bank employees. Even leaving ethical issues aside, the adequacy of applying this method outside the clinic is, to put it mildly, very doubtful. And the use of the Rorschach test (an even more complex projective clinical test that takes several years to master) in marketing focus groups (imagine, this happens too) is simply shocking. As practice shows, much more adequate and informative results in assessing professionalism can be achieved with specially designed structured interviews, the case method and the assessment center.

In terms of the variety of information provided, testing loses significantly to such methods as conversation and observation. For all its seeming simplicity, artlessness, bias and “unscientific” nature, a half-hour conversation can give an experienced psychologist or manager more information about a person than a half-hour test.

However, there are 3 main categories of tests that can be successfully used by the personnel department. These are projective, professional and cognitive tests. Projective tests provide a lot of various information about a person, do not require much time to pass, and it is very difficult to “deceive” them, since these methods rather appeal to the unconscious, having little contact with our conscious attitudes and beliefs. That is why projective techniques, among other things, are the best method for determining serious mental pathologies of an organic nature that may not be detected in observation and conversation. Cognitive tests allow you to evaluate the features of cognitive functions: distribution of attention, stress resistance, reaction speed, etc. Professional tests, as a rule, are not actually psychological. They allow you to assess the level of professional knowledge of a specialist.

In conclusion, I would like to recall that, under current legislation, neither test data nor a refusal to take a test can serve as grounds for denying an applicant or employee a position.

Computer testing is increasingly used in pedagogical practice. Perhaps soon it will almost replace traditional methods (such as "pencil-paper"), since it has clear advantages over them. What are they?

1) The computer version of testing saves a lot of time (this is probably the most important thing). The test subject's task is to simply press the key corresponding to the selected answer. The received data is automatically calculated, processed, evaluated and interpreted. As a result, the computer produces a finished report, often accompanied by diagrams, graphs and other visual images. The whole procedure, including the processing and interpretation of the results, takes much less time than with conventional testing. Such time savings are especially valuable when working with a group of testees - you can simultaneously seat a large number of people at the computer and quickly get the necessary data.

2) The tester's effort is saved: he does not have to do very tedious routine work (instructing the testee, issuing tasks, keeping a record, counting and processing the results).

3) In the presence of a well-functioning program, computer testing practically eliminates errors in processing the results - the machine always uses the same algorithm, it is not distracted and does not get tired.

4) It becomes possible to accumulate and save an electronic database. The unified database is convenient for analysis and replaces huge piles of experimental forms, reports and conclusions.

5) When using a standardized computer program, the conditions for testing do not depend on the individual characteristics and psychological state of the experimenter, which undoubtedly increases the "purity" of the diagnostic procedure.

6) During computer testing, the subject, left alone with the computer, can afford to be more frank and natural. He has no one to be ashamed of: the machine cannot react, either evaluatively or emotionally, to answers that are not the most successful from the standpoint of social desirability.

7) Thanks to computer testing, it is possible to increase information security and prevent the disclosure of the test, owing to the high speed of information transfer and special protection of electronic files. The procedure for calculating the resulting scores is also simplified in cases where the test contains only tasks with a choice of answers.

8) The advantages of computer testing also show in ongoing control and in students' self-checking during self-study: the computer can issue a test score immediately, so that urgent measures to correct the assimilation of new material can be taken based on an analysis of the protocols of corrective and diagnostic tests. The possibilities of pedagogical control in computer testing increase significantly because innovative types of test tasks widen the range of measurable skills and abilities, using the diverse capabilities of a computer: audio and video files, interactivity, dynamic problem posing with multimedia tools, etc.

9) Thanks to computer testing, the information capabilities of the control process increase: it becomes possible to collect additional data on how individual students progress through the test and to distinguish between skipped test items and items that were not reached.

10) Finally, the most routine part of the work disappears: the preparation of forms, the provision of methodological material, etc., since the entire methodology is presented in the form of a computer program. It is convenient in every way.

In addition to the advantages of computer testing, there are a number of disadvantages:

1) Typical psychological and emotional reactions of students to computer testing.

Communication between a person and a computer has its own specifics, and not everyone is equally calm about computer testing. For example, if the testing procedure drags on or the content of the test does not interest a person, a positive attitude can give way to the opposite: the monotony of the work and the "stupidity" of the questions and tasks begin to tire and annoy. Sometimes a negative attitude towards computer testing is also caused by the lack of feedback. And when the tested person does not receive feedback, the probability of erroneous answers increases (one can misunderstand the instructions, mix up the answer keys, etc.).

Special studies have been conducted to determine how people feel about computer testing. It turned out that some people experience the so-called psychological barrier effect, and some experience the effect of overconfidence. It happens that a person cannot cope with the task at all because he is "afraid" of the computer. Psychological defense mechanisms may also be triggered, associated with the test taker's unwillingness to reveal himself, a desire to avoid excessive frankness, or deliberate distortion of the results.

Negative reactions are usually caused by the various restrictions that are sometimes imposed when tasks are issued in computer testing. For example, the order in which tasks are presented may be fixed, or a maximum time may be set for each task, after which the next test task appears regardless of the subject's wishes. In adaptive testing, students are dissatisfied that they cannot skip a task, review the entire test before starting to work on it, or change their answers to previous tasks. Sometimes students object to computer testing because of the difficulties that arise in doing and recording mathematical calculations, etc.

2) Impact on test performance of previous level of computer experience.

The results of foreign studies have shown that schoolchildren's prior experience of working with computers in many cases significantly affects the validity of the test results. If a test includes only conventional multiple-choice items, the impact of computer experience on test results is negligible, since such items do not require students to perform any complex actions. When innovative types of tasks that make extensive use of computer graphics and other innovations are presented on the screen, the influence of previous computer experience on the test score becomes very significant. Thus, in computer testing, it is necessary to take into account the level of computer experience of the students for whom the test is intended.

To reduce the influence of computer experience on test scores, it is recommended to include special instructions and training exercises for each innovative form of task in computer testing shells. It is also necessary to familiarize students with the interface in advance, to conduct a rehearsal, and to place students who do not have sufficient experience with a PC in separate groups in order to train them additionally or give them a paper-based test.

3) Influence of the user interface on the results of computer testing.

The user interface comprises the functions available to the student, the means of moving through the test tasks, the layout of information on the screen, and the general visual style in which information is presented. A good user interface should provide a clear and correct logical sequence of interaction with the examinee and follow the general principles of graphic information design. The more thoughtful the interface, the less attention the student pays to it and the more he can concentrate on completing the test tasks.

4) In computer testing, specialists deal only with the results obtained. They do not see the person being tested and do not communicate with him, so they have no additional information about him and cannot find out his actual amount of knowledge. The results obtained with the help of computer testing should therefore be trusted with some reservations.