\documentclass[man,a4paper,oneside,12pt,floatsintext]{apa7}

\usepackage{lipsum}

\usepackage[english]{babel}

\usepackage{csquotes}
\usepackage[style=apa,sortcites=true,sorting=nyt,backend=biber]{biblatex}
\DeclareLanguageMapping{english}{english-apa}

% some stuff for further formatting beyond the apa7 package
\usepackage{setspace}
%\doublespacing
\renewcommand{\baselinestretch}{1.1}

\usepackage{subcaption}

% https://tex.stackexchange.com/questions/9796/how-to-add-todo-notes
\usepackage[colorinlistoftodos,prependcaption,textsize=tiny]{todonotes}
% \newcommandx{\unsure}[2][1=]{\todo[linecolor=red,backgroundcolor=red!25,bordercolor=red,#1]{#2}}
% \newcommandx{\change}[2][1=]{\todo[linecolor=blue,backgroundcolor=blue!25,bordercolor=blue,#1]{#2}}
% \newcommandx{\info}[2][1=]{\todo[linecolor=OliveGreen,backgroundcolor=OliveGreen!25,bordercolor=OliveGreen,#1]{#2}}
% \newcommandx{\improvement}[2][1=]{\todo[linecolor=Plum,backgroundcolor=Plum!25,bordercolor=Plum,#1]{#2}}
% \newcommandx{\thiswillnotshow}[2][1=]{\todo[disable,#1]{#2}}

\addbibresource{bibliography.bib}

% adding new commands to make the citation more like the natbib commandos (e.g. \citep and \citet for different types of citations)
\newcommand{\citet}[1]{\Textcite{#1}}
\newcommand{\citep}[1]{\parencite{#1}}


\title{Modeling of Transfer in Complex Tasks}
\shorttitle{}

% if you prefer to use student ID instead feel free to use that instead of name.
\author{Niclas Andreas Dobbertin} % your name or stud ID
\affiliation{Technische Universität Darmstadt}
% \course{03-03-1416-se: Advanced Topics in Multisensory Perception and Action}
\professor{Prof.\ Frank\ Jäkel}
% \duedate{26.02.2024} % fill in the due date for the submission opportunity you are aiming for


% \authornote{
%   \noindent blah
% }

\abstract{A model that simulates learning also has to account for the effect of transfer on new skills.
  Learning a skill that shares steps with a previously learned one speeds up acquisition.
  This thesis presents an ACT-R model of a task used by \citet{Frensch_1991} to investigate transfer learning.
  It gives a general overview of learning in production systems and explains the components of the model.
  Due to bugs in the ACT-R implementation that was used, no results can be presented; however, pain points in working with ACT-R are discussed to motivate future work.
}

\begin{document}
\maketitle

\section*{Introduction}

When trying to understand how humans learn, transfer learning is particularly interesting.
Skills acquired through training can, by some mechanism, speed up the acquisition of a different skill.
Modeling this mechanism requires taking into account all of the steps the mind goes through when solving a task, so that they can be re-used, or rather transferred, to another task.
Unified Theories of Cognition are what \citet{newell1994unified} argues is the approach to gain a complete understanding of the human mind.
Also called cognitive architectures, they combine all of the specialized faculties of the mind into one single framework that ideally mimics what the human mind does.
Using such an architecture, it should be possible to describe a task in detail and observe transfer of learning to another task.

Transfer learning was previously examined by \citet{Frensch_1991} to differentiate the transfer effects of learning the components of a task from those of learning the composition of components in a task.
They used an experiment introduced by \citet{Elio_1986}, in which multi-step mathematical equations have to be learned under different ordering conditions.
To test transfer, one equation is swapped for a new one and the speed at which it is learned is measured.
This kind of task seems appropriate to model in a cognitive architecture to see how the architecture predicts transfer learning.

ACT-R \citep{anderson2004} is an established cognitive architecture that uses productions to model procedures in the mind.
There are several methods that use these productions to describe learning.
\citet{Brasoveanu_2021} compared different reinforcement learning algorithms within one such method, albeit on a lexical task.
For this they created a re-implementation of ACT-R in Python \citep{Brasoveanu_2020}, which seems like a good starting point for implementing Elio's task in a model.


\subsection*{Productions}

Productions decide how a production system behaves and what actions it takes.
A production consists of two parts, a condition and an action (Table~\ref{tab:exprod}).
All statements listed in the condition must be fulfilled to make the production eligible for selection.
In ACT-R, conditions mostly check for specific variable values, but they can also check whether certain buffers are empty, full, or in an error state, e.g.\ after failing to retrieve something from declarative memory.
Only productions whose conditions are satisfied by the current state of the model can be selected.
Once a production has been selected, its action is executed.
Productions in ACT-R change the values of variables and start visual, motor, and memory-related processes.

If the conditions of multiple productions are satisfied, ACT-R chooses the production with the highest utility.
Each production starts with a baseline utility value, which gets updated by the model during its runtime.

\begin{table}[hb]
\caption{Example Production}
\label{tab:exprod}
\begin{tabular}{lr}
\toprule
\textbf{IF} \\
  variable1 = true \\
  variable2 = 10 \\
\midrule
  \textbf{THEN} \\
  variable2 = 9 \\
  press button \\
\bottomrule
\end{tabular}

\bigskip
\small\textit{Note}. A production consists of two parts:
  1. The conditions (\textbf{IF}), which must be fulfilled for the production to be available for selection.
  2. The actions (\textbf{THEN}), which are performed when the production is selected.

\end{table}

\subsection*{Learning}

There are a variety of methods that production systems use to model learning.
ACT-R can adjust which production is given preference during selection or create new productions based on existing ones and the model's state.

When multiple productions are applicable to the current state, the production that the model thinks is the most useful should be selected.
How useful a production is can be learned while the model is running and is modeled in ACT-R through a reinforcement-learning-like process called utility learning.
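As a brief sketch of how this mechanism works in recent versions of ACT-R: whenever a reward is received, the utility $U_i$ of every production $i$ that fired since the last reward is moved a little closer to that reward,
\[
  U_i(n) = U_i(n-1) + \alpha \left[ R_i(n) - U_i(n-1) \right],
\]
where $\alpha$ is a learning-rate parameter and $R_i(n)$ is the reward discounted by the time that passed between the firing of production $i$ and the reward, so productions that fire closer to the reward benefit more.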

Oftentimes a series of productions needs to be executed in order; such a series can be combined into a single production that performs all of the actions at once, saving the time spent deciding which production to use.
This method is called production compilation.
When two productions fire successfully in a row, a production compilation process is started that combines both into a single production, if possible.
Since compiled productions are specific to the buffer values at the time of compilation, there can be many different combined productions derived from the same two parent productions.
E.g.\ a production starting retrieval of an addition fact and a production using the retrieved fact can be combined into productions for specific addition-result combinations, skipping retrieval (shown in Table~\ref{tab:prodcomp}).

ACT-R's subsymbolic system also models the latency and accuracy of declarative memory, where retrieving memories can fail depending on their activation strength.
Activation strength increases the more often a memory is created or retrieved.
Learning new facts and increasing their activation strength is therefore also part of the learning process in an ACT-R model.
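As a sketch of the standard base-level learning equation behind this: a chunk $i$ has activation
\[
  B_i = \ln \left( \sum_{j=1}^{n} t_j^{-d} \right),
\]
where the sum runs over all $n$ presentations (creations or retrievals) of the chunk, $t_j$ is the time elapsed since the $j$-th presentation, and $d$ is a decay parameter.
Retrieval succeeds only if the activation (plus noise) exceeds a threshold, and higher activation also leads to faster retrieval.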


\begin{table}[hb]
  \begin{subtable}[h]{0.30\textwidth}
    \caption{Production 1}
    \label{tab:prodcompa}
    \begin{tabular}{lr}
    \toprule
    \textbf{IF} \\
      operation = subtract \\
      argument1 = x \\
      argument2 = y \\
    \midrule
      \textbf{THEN} \\
      retrieve: x - y \\
    \bottomrule
    \end{tabular}
  \end{subtable}
  \hfill
  \begin{subtable}[h]{0.30\textwidth}
    \caption{Production 2}
    \label{tab:prodcompb}
    \begin{tabular}{lr}
    \toprule
    \textbf{IF} \\
      operation = subtract \\
      retrieve = z \\
    \midrule
      \textbf{THEN} \\
      press button: z \\
    \bottomrule
    \end{tabular}
  \end{subtable}
  \hfill
  \begin{subtable}[h]{0.30\textwidth}
    \caption{A Compiled Production}
    \label{tab:prodcompc}
    \begin{tabular}{lr}
    \toprule
    \textbf{IF} \\
      operation = subtract \\
      argument1 = 3 \\
      argument2 = 1 \\
    \midrule
      \textbf{THEN} \\
      press button: 2 \\
    \bottomrule
    \end{tabular}
  \end{subtable}

  \caption{Production Compilation}
  \label{tab:prodcomp}

\bigskip
\small\textit{Note}. Table~\ref{tab:prodcompa} shows a production with the condition that the operation variable must be ``subtract'' and that argument1 and argument2 hold some values x and y.
If selected, it starts retrieval of the result of $x - y$ from declarative memory.
Production 2 (Table~\ref{tab:prodcompb}) is selected when the operation value is subtract as well and the retrieval variable is filled with a value z.
It then starts a motor process to press button z.
When the model executes both productions one after another, it starts the production compilation process with the current model state.
E.g.\ in Table~\ref{tab:prodcompc}, if argument1 was 3 and argument2 was 1, a new production is created that, whenever the same model state occurs again, skips retrieval and directly presses the result.
That means a different specific production can be created for each combination of x, y and z.

\end{table}

\subsection*{Task}

To investigate model behavior and potentially compare it to results from human experiments, an adapted version of the setup described in \citet{Frensch_1991}, first used by \citet{Elio_1986}, was chosen.
Subjects are put in charge of determining the quality of water samples by performing simple mathematical operations on indicator values given for each water sample.
A water sample has an algae value, a solids value, and multiple toxin and lime values, all of which are randomly generated for each sample.
There are six different 2-step equations that use these values and a seventh equation that uses all previously calculated results to determine the final result (see Table~\ref{tab:proc}).
To solve a procedure, subjects have to locate the values of the variables it uses on the screen.
Some variables show multiple values; procedures using such a variable indicate after an underscore which of its values to use.
For example, x\_2 means taking the second value of variable x.
Other procedures require finding the maximum or minimum value of a variable or of previous solutions.
An example of how the screen could look during a trial is shown in Figure~\ref{fig:frensch}.

The experiment starts with 75 acquisition trials, each representing a water sample, in which a random selection of six of the seven procedures has to be solved in the order they are presented.
The final-result procedure is always part of this selection, as it uses all previous results of a water sample to calculate the final solution.
Afterwards, 50 transfer trials take place, in which the third procedure from the acquisition phase is switched for the one left out.
There are three conditions that determine the order in which procedures are presented in the acquisition phase; the procedure for the final result, however, always comes last.
In the fixed condition, the order is randomized once at the start and stays constant across all trials.
In the random condition, the procedure order is re-randomized between trials.
In the blocked condition, the first procedure has to be solved for all trials before moving on to the second procedure, and so on.
The transfer phase always uses a fixed order.
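To make the three ordering conditions concrete, the following plain-Python sketch (with hypothetical names, not the actual experiment code) generates per-trial procedure orders; the blocked condition is simplified here to a flat sequence of single-procedure trials.

\begin{verbatim}
import random

def acquisition_orders(procedures, final_procedure, condition, n_trials=75):
    """Per-trial procedure orders for the acquisition phase.

    `procedures` are the six selected two-step procedures; the
    final-result procedure is always appended last."""
    if condition == "fixed":
        # one random order, kept constant for every trial
        order = random.sample(procedures, len(procedures))
        return [order + [final_procedure] for _ in range(n_trials)]
    if condition == "random":
        # a fresh random order for every trial
        return [random.sample(procedures, len(procedures)) + [final_procedure]
                for _ in range(n_trials)]
    if condition == "blocked":
        # all trials of the first procedure, then all of the second, ...
        order = random.sample(procedures, len(procedures))
        return ([[p] for p in order for _ in range(n_trials)]
                + [[final_procedure] for _ in range(n_trials)])
    raise ValueError(condition)
\end{verbatim}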


How this task is modeled in ACT-R is described in the Model section.

\begin{table}[hb]
\caption{Experiment Procedures.}
\label{tab:proc}
\begin{tabular}{c}
\toprule
Procedures \\
\midrule
  $(Sandstein_4 - Sandstein_2) * Mineralien$ \\
  $(2 * Algen) + Sandstein_{min}$ \\
  $Gifte_{max} + Gifte_{min}$ \\
  $(Mineralien * 2) - Gifte_4$ \\
  $Das\, Höhere\, von\, (Gifte_3 - Gifte_2), (Sandstein_3)$ \\
  $Das\, Kleinere\, von\, (Sandstein_1 + Gifte_1), (Algen)$ \\
  $100 - dem\, Höchsten\, aller\, Ergebnisse$ \\
\bottomrule
\end{tabular}

\bigskip
\small\textit{Note}. The seven translated procedures used in this experiment.
Six of them are used in the acquisition phase; in the transfer phase, one procedure is swapped for the previously unused one.
The bottom procedure is always included, as it calculates the total water quality.
\end{table}

\begin{figure}[H]
    \centering
    \caption{Screenshot of experiment display}
    \label{fig:frensch}
        \includegraphics[width=1.1\textwidth]{exp_screen.png}

    \bigskip
    \raggedright\small\textit{Note}. Example water sample presented in an experiment using the adapted task from \citet{Frensch_1991}.
    In the first procedure, a subject has to find the smaller of $Sandstein_{1} + Gifte_{1}$ and $Algen$.
    First they need to find the value of $Algen$ and the first values in the lists of $Sandstein$ and $Gifte$ to substitute them into the equation.
    Next they can calculate the sum inside the parentheses and enter the smaller of that sum and $Algen$ as the result.

  \end{figure}

\section*{Model}

The goal of the model is an accurate representation of how a human would solve this task and improve over time.
Ideally, the model's solving time would, in each condition, improve similarly to previous human results.
Looking at the ways an ACT-R model can improve, production compilation seems to be the most important mechanism, compared to utility or chunk learning.
A lot of small subtasks have to be accomplished for a single trial, such as finding the correct variable values, solving multiple mathematical operations and typing the answer.
These steps, however, need to be repeated for each trial, and while the numbers, and with them the mathematical operations, change somewhat, the overall order and structure of the subtasks stays the same.
Production compilation therefore promises strong improvements to solving times, as many steps can be combined into a single one, eliminating the time spent deciding on the next step.
Additionally, the numbers in this task are often small, allowing common operations to be saved as productions in procedural memory, removing the time spent calculating or retrieving from declarative memory.

Utility learning is needed to evaluate the usefulness of compiled productions, but since the task and subtask order is very rigid, it should otherwise play no important role in learning.
Chunk learning does not seem impactful either, as there are too many permutations of variable values and too few trials to memorize helpful information.

To complete the experiment in the manner a human adult would, the model is given a baseline of knowledge and skills to start with.
This includes basic knowledge of the possible numbers and of the mathematical operations it has to solve.

\subsection*{Implementation}
The model was built with the ACT-R architecture \citep{anderson2004} through the pyactr implementation \citep{Brasoveanu_2020}.
The base model uses default parameters.
To enable production compilation and utility learning, the parameters ``production\_compilation'' and ``utility\_learning'' have to be set to ``True''.
Due to implementation details in pyactr, the subsymbolic system has to be enabled as well.
Issues and workarounds encountered while implementing the model are reviewed in the Discussion.
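A minimal sketch of this setup in pyactr is shown below; the parameter names are the ones mentioned above, while the exact constructor call is assumed from the pyactr documentation.

\begin{verbatim}
import pyactr as actr

# Minimal sketch of the model setup; constructor arguments assumed
# from the pyactr documentation.
math_model = actr.ACTRModel(
    subsymbolic=True,            # needed by pyactr for the learning mechanisms
    utility_learning=True,       # reinforcement-learning-like utility updates
    production_compilation=True  # combine successive productions into one
)
\end{verbatim}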

The model works with four specified chunk types.
Number chunks hold a number, its digits and the number one higher.
Math operation chunks hold an operation, two arguments and a result.
Procedure chunks hold the operations, variables and values that make up a procedure in the experiment.
The math goal chunk is used in the goal buffer and holds various slots used during operations, such as the current operation, arguments, counters and flags.
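A sketch of how these four chunk types could be declared in pyactr follows; the slot names are illustrative, not necessarily those of the actual model.

\begin{verbatim}
import pyactr as actr

# Illustrative slot names, not necessarily those used in the model.
actr.chunktype("number", "value, hundreds, tens, ones, successor")
actr.chunktype("math_operation", "operation, arg1, arg2, result")
actr.chunktype("procedure", "key, op1, var1a, var1b, op2, var2a, var2b")
actr.chunktype("math_goal",
               "operation, next_operation, arg1, arg2, answer, "
               "tens, ones, counter, carry")
\end{verbatim}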

The model is given some basic knowledge that does not have to be learned, in the form of chunks set at model initialization.
It already knows each procedure and can retrieve its operations and values with a key.
It still has to find the right key by visually searching for the current procedure on the screen.
It knows all numbers from 0 to 999 through the number chunk type.
It has math operation chunks for all greater/less comparisons of numbers between 0 and 20.
It has math operation chunks for the addition of numbers between 0 and 20.
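As a sketch (assuming the model object and chunk types sketched above, and pyactr's declarative-memory interface), these facts could be added at initialization like this:

\begin{verbatim}
# Seed declarative memory with number chunks and small addition facts.
for n in range(1000):
    math_model.decmem.add(actr.makechunk(
        typename="number", value=n, hundreds=n // 100,
        tens=(n // 10) % 10, ones=n % 10, successor=n + 1))

for a in range(21):
    for b in range(21):
        math_model.decmem.add(actr.makechunk(
            typename="math_operation", operation="add",
            arg1=a, arg2=b, result=a + b))
\end{verbatim}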

All trials are generated before the simulation starts and ordered depending on the condition.
The model uses an environment to simulate a computer screen.
Elements are arranged in columns, with the values in rows below their column header.
Every time the user inputs an answer or the variables change, the environment variables are updated directly.
User input and trial changes are detected from the model trace.
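The sketch below illustrates this layout as data, loosely following the dictionary-of-stimuli format that pyactr's environment expects; the helper name, positions and spacing are made up for illustration.

\begin{verbatim}
# Hypothetical layout helper: one column per variable, header on top,
# values in rows below it.
def make_screen(sample):
    """sample: dict mapping column headers to lists of values."""
    stimuli = {}
    for col, (header, values) in enumerate(sample.items()):
        x = 50 + col * 120
        stimuli[header] = {"text": header, "position": (x, 50)}
        for row, value in enumerate(values):
            stimuli[f"{header}_{row}"] = {"text": str(value),
                                          "position": (x, 80 + row * 30)}
    return stimuli
\end{verbatim}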

The model works through the tasks with a set of productions, which perform mathematical operations, search the screen, input answers and organize the order of operations.

\subsubsection*{Greater/Less-than Operation}

This pair of operations compares two multi-digit numbers and sets the greater/lesser number as the answer.
For each digit (hundreds, tens, ones) there is a set of productions comparing that digit of the two numbers.
Each production set for a digit requires all more significant digits to be equal.
That means that the productions comparing the tens can only fire if the hundreds are equal, and the productions comparing the ones can only fire if both the hundreds and the tens are equal.
The selected production then retrieves a comparison of the two digits from declarative memory.
Depending on the result, either number 1 or number 2 is written into the answer slot.
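The following plain-Python sketch (not the model code) shows the computation these productions implement:

\begin{verbatim}
# Digit-wise comparison as performed by the productions; `digits` splits a
# number into (hundreds, tens, ones) as stored in the number chunks.
def digits(n):
    return (n // 100, (n // 10) % 10, n % 10)

def greater_of(num1, num2):
    # Compare the hundreds first; only if they are equal fall through to
    # the tens, and only then to the ones.
    for d1, d2 in zip(digits(num1), digits(num2)):
        if d1 != d2:              # single-digit comparison (a retrieval)
            return num1 if d1 > d2 else num2
    return num1                   # the numbers are equal
\end{verbatim}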

\subsubsection*{Addition Operation}

This operation adds two numbers through column addition.
The first production retrieves the sum of the ones digits of the two numbers.
The sum is put into the ones digit of the answer.
Next, the model tries to retrieve an addition fact from memory in which 10 plus some number equals the previously found sum.
If the retrieval fails, the result of the ones addition was less than ten and no carry-over is necessary.
If the retrieval succeeds, a carry flag is set and the second addend of the retrieved fact (the part over 10) is set as the ones digit of the answer.
Next, the sum of the tens digits of the numbers is retrieved.
If the carry flag is set, 1 is added to the sum.
Again, the model checks for a carry-over and sets the carry flag if necessary.
The same then repeats for the hundreds digits.
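In plain Python, the computation carried out by these productions looks roughly as follows (a sketch, not the model code; \texttt{add\_fact} stands in for the declarative retrieval of a small addition fact):

\begin{verbatim}
def add_fact(a, b):
    return a + b          # in the model this is a retrieval, which may fail

def column_add(num1, num2):
    answer, carry = 0, 0
    for place in (1, 10, 100):                   # ones, tens, hundreds
        d1, d2 = (num1 // place) % 10, (num2 // place) % 10
        s = add_fact(d1, d2) + carry
        # "retrieve 10 + x = s" succeeds only if the column sum is >= 10
        if s >= 10:
            carry, s = 1, s - 10                 # keep the part over 10
        else:
            carry = 0
        answer += s * place
    return answer + carry * 1000                 # final carry, if any
\end{verbatim}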

\subsubsection*{Multiplication Operation}

This operation multiplies two numbers through repeated addition.
Several productions handle the cases in which one of the arguments is 1 or 0 and directly set the answer accordingly.
First, the model tries to retrieve the sum of the second argument plus itself and sets a counter to 1.
If the retrieval succeeds, the answer is set to the sum and the counter is incremented by 1.
While the counter is not equal to argument 1, the sum of argument 2 plus the current result is retrieved and the counter is incremented.
Once the counter is equal to argument 1, the operation is finished.
If the retrieval of a sum fails, the arguments and the counter are saved in separate slots, the current operation is changed to addition, and the next operation is set to multiplication.
When the current operation is multiplication again and there are values in the saved argument slots, the arguments are restored and the operation continues.
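A plain-Python sketch of this logic (not the model code; \texttt{retrieve\_sum} stands in for a declarative retrieval that fails when the fact is unknown):

\begin{verbatim}
def retrieve_sum(a, b, known_limit=20):
    return a + b if a <= known_limit and b <= known_limit else None

def multiply(arg1, arg2):
    if arg1 == 0 or arg2 == 0:      # special-case productions
        return 0
    if arg1 == 1:
        return arg2
    if arg2 == 1:
        return arg1
    result, counter = arg2, 1
    while counter < arg1:
        s = retrieve_sum(arg2, result)
        if s is None:
            # in the model: save the state, switch to the addition
            # operation, then resume multiplication with the sum
            s = arg2 + result
        result, counter = s, counter + 1
    return result
\end{verbatim}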

\subsubsection*{Subtraction Operation}

The subtraction algorithm uses the Austrian method, checking for each digit whether the subtrahend digit is greater than the minuend digit.
If not, the two digits can safely be subtracted and the model moves on to the next digit.
If yes, the subtraction is done after increasing the minuend digit by 10.
Additionally, a carry variable is set, which increases the subtrahend by 1 at the next digit.
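A plain-Python sketch of this digit-wise procedure (assuming, as in the experiment procedures, that the minuend is the larger number):

\begin{verbatim}
def austrian_subtract(num1, num2):
    answer, carry = 0, 0
    for place in (1, 10, 100):                   # ones, tens, hundreds
        minuend = (num1 // place) % 10
        subtrahend = (num2 // place) % 10 + carry
        if subtrahend > minuend:
            minuend += 10                        # borrow by adding 10
            carry = 1                            # raise next subtrahend by 1
        else:
            carry = 0
        answer += (minuend - subtrahend) * place
    return answer
\end{verbatim}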

\subsubsection*{Motor System}

The motor module is used to input the answers and to press continue.
When the current operation is typing the answer, the first production requests a key press of the tens digit.
When that action is finished, the ones digit and then the space bar (to continue) are requested in turn.
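As an illustration, such a key-press production in pyactr's production-string syntax could look roughly like the sketch below; the goal slots are illustrative and the manual-buffer request follows the form used in the pyactr documentation.

\begin{verbatim}
math_model.productionstring(name="type tens digit", string="""
    =g>
    isa math_goal
    operation type_answer
    tens =t
    ==>
    =g>
    isa math_goal
    operation type_ones
    +manual>
    isa _manual
    cmd press_key
    key =t""")
\end{verbatim}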

\subsubsection*{Visual System}

The visual module is used for various operations, to find the current task or to replace variables in a task with the values shown on the screen.
The screen is organized in columns with headers, so the visual module first searches for the correct column by keyword.
Different kinds of searches are then performed, depending on what is requested.

To find the next task, the search goes down the column of tasks and saves the task at the current row.
If there is nothing in the answer column at the same y-coordinate, the currently saved task has not been answered yet and the search is done.
To find a variable value by index, the search travels down the column while counting and stops at the desired index.
To find the max/min value of a variable, the search travels down all values in the column, checks for each one whether it is greater/less than the currently saved value and replaces the saved value if necessary.
Once all values are checked, the search is finished.
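A plain-Python sketch of the max/min search (not the model code; \texttt{column} is the list of values below a header, read top to bottom):

\begin{verbatim}
def find_extreme(column, mode="max"):
    best = column[0]                    # save the first value
    for value in column[1:]:            # travel down the column
        if (mode == "max" and value > best) or \
           (mode == "min" and value < best):
            best = value                # replace the saved value
    return best
\end{verbatim}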

\subsubsection*{Utility Operations}

Several productions dictate in what order operations are executed.
When the operation slots are empty, the visual search for the next unanswered task is started.
When a task is found, productions check whether the arguments are already numbers and, if not, request a visual search to substitute them with the values on screen.
When a task is finished, the result is saved in a slot and the other slots are reset, starting the task search again.
If the second step of a procedure is finished, the motor input of the answer is started.

One production detects when the current operation is finished and another operation is queued, and sets the next operation.

Since operations use both the full numbers and their digits, a set of productions fills the digit slots with the digits of a number and vice versa.


% \begin{figure}[H]
%     \centering
%     \caption{Logic Flow of Addition}
%     \label{fig:addition}
%         %\includegraphics[width=1.1\textwidth]{frensch.png}

%     \bigskip
%     \raggedright\small\textit{Note}. When each production is executed depending on state. Either example for one operation or figures for all?\end{figure}

\section*{Results}

Without enabling the subsymbolic system and its learning algorithms, the average time the model takes to solve a specific procedure stays the same over the experiment.
This is expected; while each completed mathematical operation is remembered by the model, the number of argument and operation permutations is too high for this to be useful in so few trials.

Due to multiple roadblocks in working with the subsymbolic system in pyactr, it was not possible to simulate a full experiment run with it enabled.
Details about these difficulties will be reviewed in the Discussion.

% \begin{figure}[H]
%     \centering
%     \caption{Mean solution time in acquisition and transfer phase}
%     \label{fig:RT}
%         % \includegraphics[width=1.1\textwidth]{RT.png}

%     \bigskip
%     \raggedright\small\textit{Note}. Mean solution time of all six procedures of a water sample in blocks of five samples.
%   \end{figure}


% \begin{figure}[H]
%     \centering
%     \caption{Comparison with human experiment}
%     \label{fig:RTcomp}
%         % \includegraphics[width=1.1\textwidth]{RT.png}

%     \bigskip
%       \end{figure}

\section*{Discussion}

This model shows that it is possible to implement the task in ACT-R and that the model should be able to produce task solving times for comparison with human subjects.
During development, however, a variety of difficulties emerged that ultimately prevented ACT-R's learning functions from simulating human learning.
An ACT-R implementation in Python, pyactr, was used to program this model, which brought some pyactr-specific problems with it.
Difficulties in re-implementing ACT-R were already mentioned by \citet{albrecht2014}, who state that today ACT-R is specified by its implementation rather than by a formal specification.
The implementation of production compilation in pyactr seems to include some critical bugs, causing the model to crash when compiling certain productions.
While this showed that production compilation works in most cases, the crashes stop it from being used in a model and prevented us from investigating its effect on our task.
Another problem was the missing implementation of relative coordinates in visual search, meaning that scanning objects left to right for a specific one is not possible and had to be circumvented by hard-coding all possible object positions to search.
Since pyactr is developed by very few people, it is sadly natural that specific parts and usages do not work correctly, despite the package being functional in general.

In general, it was not clear how a production or set of productions has to be written in order to accomplish a given task correctly.
While ACT-R provides a lot of tools to handle many situations, it was surprising that even basic operations like multi-digit addition or subtraction do not have an example implementation.
Whether to use the goal buffer or the imaginal buffer, how to sequence tasks, how general or specific productions should be and how much strict ordering the goal buffer should enforce were important considerations during development and had to be answered more by feeling than by knowledge from references.
There are various models used in ACT-R tutorials to introduce its capabilities; these, however, are very limited and do not go beyond very simple tasks.
Papers usually do not include the exact model and productions used, which leaves few examples and general guidelines for new model makers.
Implementing new models would be much easier if something similar to a software library existed for ACT-R.
It could contain simple, common components, such as mathematical and lexical operations, visual search and the handling of task switching or subgoals.
Such a library would additionally serve as an example of the proper implementation of different productions in ACT-R, giving guidelines to newer model makers.


\subsection*{Model Improvements}

Most importantly, solving the production compilation problem and actually comparing the model's learning behavior with human data would be the next step from this point on.

While the model currently does not work correctly, a variety of improvements are possible once the technical issues are removed.
Mathematical operations could be modeled much more generally, so that they also work with larger and negative numbers.
This would make it possible to learn mathematical facts from the ground up, instead of relying on a set of given knowledge.
Introducing multiple ways of doing an operation, like addition by counting from 1 or from the first argument, as well as shortcuts like swapping arguments in applicable operations, would give the model more opportunity to utilize its utility learning.
Another important improvement would be better switching between tasks, as e.g.\ multiplication requires additions to be performed.
This required a complex set of productions, which a general task-switching implementation could simplify.

It would be interesting to see how other cognitive architectures behave in comparison to ACT-R, for example SOAR \citep{laird2022introductionsoar} and especially the PRIMs architecture \citep{Taatgen_2013}, which specializes in the transfer of knowledge through small knowledge units.


\printbibliography{}

% \end{figure}


\end{document}