Creation and Evaluation of a Multimodal Touch-Based Application for Learning the Physics of Motion

 

Abstract

Introduction

Theoretical background and related work

Graphical representations of motion in physics education

Multiple and Multimodal Representations

Related Work and recent efforts

Research questions

The Application - method and design

Test design

The doer sessions

The watcher sessions

Results

Pre intervention test results

Post intervention test results


Analysis of results

Pre intervention test scores

The gender hypothesis

The doers versus watchers hypotheses

Gesture versus no gesture hypothesis

A multi-hypothesis analysis

Conclusions and discussion

Student activity and motivation

Gender difference

Discussion

Outlook and future efforts

References


Abstract

A prototype multimodal tablet application for learning the physics of motion has been developed, tested, and evaluated. The application enabled the user to plot the position and velocity of their finger as graphs, in real time, by touching the screen. The learning outcome of the test subjects using the application was compared with that of a group who were shown the same procedure and given an explanation of all the physics involved. A small but non-significant difference was seen between these groups. However, a significant difference was seen between the male and female test subjects, especially on the sub-set of questions of a more analytical nature.

Introduction

Recently there has been an explosion in the number of computers in Swedish schools (Sverigekartan, 2011): more than 50% of the municipal schools now plan 1-to-1 programs, effectively putting a computer in the hands of every young learner. Lately, a similar explosion in the number of touch-based devices in the hands of learners can be seen. This has opened up a new field for the design of touch-based applications for education. Traditionally, learning tools and materials have relied upon textual representations combined with images, and in some cases moving pictures and sound. The learner is most often a passive consumer of the material presented. The new touch-based devices such as the iPad, smartphones and Android tablets make a more interactive component of the learning experience possible. This can be seen in the ever-growing number of “apps” available for children and young learners in the Apple and Android marketplaces. Oftentimes these applications are still very much the traditional learning material dressed in a digital robe, not fully taking advantage of the possibilities of the new technology. In cognitive science, research on multimodal representations of complex concepts in science stresses the importance of involving a multitude of senses in the learning process. For example, iPad applications for learning physics are often written in a way that involves sound, text, images, and moving objects, but the interaction with the learner is still restricted to the activation of a specific movement or event. The actual movement or other physical process under study is still generated by the computer (such as the iPad) for the learner to observe and in some cases manipulate. The learner observes a moving object and might be presented with information about the motion in terms of graphs showing the speed versus time, but they do not produce the motion.

This thesis will present and evaluate a method for doing just this: studying the motion of your own hand in real time using a nowadays almost ubiquitous type of hardware. The importance of this kind of embodied multimodal learning experience is stressed by Donovan and Bransford (2005), as well as by Linn and Eylon (2011), especially when it comes to different kinds of representations of mathematical abstractions of physical processes and events. With the introduction of touch- and gesture-based devices, the student's own body can now be made a much larger part of the learning experience, adding the senses of touch and movement to the list of representations and modalities.

This thesis will look specifically at the topic of physics education and how the new technology can be used to aid students' understanding of its mathematical representations. That there is room and need for new ideas when it comes to the teaching of physics in Swedish schools is stressed by Skolverket (2009). Bridging the gap between qualitative understanding and the ability to perform quantitative calculations in physics education (Skolverket, 2009) is essential if high school students are not to lose interest in the subject. The same can be said about the ability to compete with other sources of knowledge and information in the ever-growing media arena on its different platforms, such as the new mobile devices in the hands of most students of today, and certainly of tomorrow. We do need to catch the attention of the new generation of learners (Skolinspektionen, 2010). Taking the learning to the arenas where they already spend time and know their way around is one way to go about it. At the same time, utilizing the possibility for a broader range of senses to be involved in the learning process would potentially increase the learning outcome. If the students could interact with, manipulate and produce the objects of learning themselves, the idea is that they would get more emotionally involved in the learning process. Make this perceptual type of learning process game-based, and the effect is amplified even further. This would also provide part of the variation in the teaching of physics that the (Swedish) pupils call for (Skolinspektionen, 2010), as well as the type of variation stressed by Rose (1985).

The contribution of this thesis is the creation and testing of the efficiency of an application specifically designed for learning the physics of motion using touch-based consumer technology such as the iPad. When the user touches the screen, a distance-time and a velocity-time graph are produced simultaneously, in real time, at the tip of the user's finger (see image XXX).

The application will potentially aid learners in understanding these different graphical representations of motion. They will also learn to see the connections between them, and the connections to the actual movement they represent. The learning experience thus includes embodied interactions involving several sensory modalities, and the design and envisioned usage scenarios are based on the theoretical arguments and research presented in the next chapter.

The developed prototype application has been blind tested on high school students divided into two groups. In one group the students got to use the application without supervision; these students are the test group called “doers”. The other group of students had the application shown to them and received instruction on the topics it illustrates in a more formal classroom setting; this group acts as a reference group and is called the “watcher” group. The hypothesis is that, although the watcher group receives more formal information and instruction on the physics involved (they get to know what is “right”), the doer group will perform better on a post-test due to the embodied, multimodal, personal interaction.

The end goal of this work is to enable learners to use the kind of application developed here in both formal and informal settings. Motivating this goal are the findings of Anastopoulou, Sharples and Baber (2011). The aim of this thesis is hence to verify those findings in general, using the developed method specifically. It also aims to look into the line of research proposed by Anastopoulou et al. (2011), where it was observed that many of the test persons in the doer group made gestures with their hands during the post-test, as if they were actually using the apparatus from that study. The ability to visualize the movements in the questions was proposed as a way to explain the results. The last of the research questions once again points towards a new line of argument for the widespread use of this kind of new application.

The rest of this thesis continues with a theoretical background and an overview of related work, a description of the prototype application in a typical pedagogical scenario, a description of the quantitative test, the results of the test, an analysis of the results, some conclusions and a discussion around these. Finally, an outlook will lead into a more concrete list of proposed future research.

Theoretical background and related work

At the present moment it is the author's opinion that there is a lack of high-quality physics applications taking the learning and cognitive sciences into account. In October of 2011, the Algodoo simulation software proved that there is a market for these types of applications, immediately jumping to the top-grossing position in the Apple App Store. This also implies that game-like applications with an informal design and look could be the way to go in terms of spreading this kind of application, and hence their pedagogical content. So far, studies on the use of apps not specifically designed for physics, such as that by Kelly (2011), also point in that direction.

Below I will review a number of papers that have been identified as pointing to real possibilities in tackling the challenge of teaching young learners the meaning of abstract mathematical representations within physics. This will further motivate how to match the availability and capability of the new types of touch- and gesture-based personal devices with the suggestions from theory and recent studies. I will thus point at some solutions already being tested in the same area as the work presented here. First I will introduce the main theoretical concepts of importance for this kind of study.

Graphical representations of motion in physics education

The classic illustrations of motion in terms of graphs are distance versus time (d-t) and velocity versus time (v-t) graphs (see image XXX), showing how an object's position and velocity change with time. This study concerns learners' ability to interpret and produce these kinds of illustrations, as well as to find the connections between them.

There is a multitude of sources of knowledge about learners' understanding of physics concepts connected to the physics of motion, and to the graphing of motion in particular. Some of these sources also contain elements of the methods and theoretical keywords of this thesis. Beichner (1990) found that the simultaneous production of graphs with the visualization of the motion did not in itself seem to have a significant effect on the students' learning and understanding. It was pointed out that what was probably of importance was the student being able to control and manipulate the motion connected to the graphical visualizations. Lucia and Cicero (2010) studied the positive effects of using a motion sensor as a means to help students make the connection between real-world motion and the abstract representations of these physical phenomena. The application tested in this thesis takes both these points of view into account: the students manipulate and control the motion represented by the graphical visualization, and they do this by means of what could be called a motion sensor, namely the iPad screen sensing the motion of the student's finger.

Multiple and Multimodal Representations

With multiple representations of a concept like velocity, one generally refers to the different ways of illustrating that same concept. It could be done in terms of a formula, say \(v = gt\), illustrating the change in vertical speed for a free-falling body. It could also be done in terms of a movie showing the actual fall, a graph showing the velocity as a function of time, or a series of vectors of different lengths. Ainsworth (2008) stresses the importance of the use of multiple representations, such as graphical ones together with others, when it comes to aiding students in understanding complex scientific concepts. Citing Kaput (1989), she argues that “the cognitive linking of representations creates a whole that is more than the sum of its parts”.
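As an aside, the free-fall example can be written out explicitly (a standard kinematics result, assuming the body is released from rest and air resistance is neglected); it also shows how the d-t and v-t representations of one and the same motion are connected:

```latex
% Velocity and fallen distance for a body released from rest, with g \approx 9.8 m/s^2:
v(t) = g\,t, \qquad d(t) = \tfrac{1}{2}\, g\, t^{2}
% For example, after t = 2 s: v \approx 19.6 m/s and d \approx 19.6 m.
```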

Multimodal representations, on the other hand, generally refer to representations involving the different sensory modalities, such as hearing, sight or touch, in combination with communicative modalities. Examples of communicative modalities are gestures, speech, or images. The motion of an ambulance could be communicated using moving images or a table of values of position and time, as perceived through the visual modality. Adding sound would add the possibility to experience the Doppler effect as the ambulance passes. Standing at a zebra crossing, you could actually sense the ambulance passing as the pressure wave hits you, involving yet another modality in your experience of the phenomenon.

In the test of the application developed for this study, the user is presented with a textual or graphical representation of motion. When they use the application they involve their visual modality in coordination with tactile/kinesthetic modalities, as they communicate and interact with the graphical representations using hand gestures. The learning experience thus involves the learner's body in more than just passive observation through its sensory modalities; the learning becomes more of a “learning by doing” type. The intertwining of different modalities, and the fact that the learners produce the object of study with their own bodies, is the key idea behind the proposed learning outcome, as drawn from the cognitive and learning sciences.

Related Work and recent efforts

In a recent paper, Anastopoulou, Sharples and Baber (2011) demonstrated the above-mentioned importance and power of getting the students and their bodies involved in the learning process. They studied how the creation and manipulation of representations involving multiple modalities and types, describing and communicating the meaning of mathematical concepts of motion such as speed, distance, time and velocity, affected the learning outcome. Being able to produce and manipulate graphs showing the displacement and velocity of their hands moving as a function of time was shown to significantly increase the learning outcome. They compared pre- and post-tests with students that did not involve their own bodies in the learning but instead watched a teacher perform the same action. The background to the study by Anastopoulou et al. (2011) comes from Papert (1980), who suggested that students can be aided in their learning by using their own bodies to construct symbolic representations, a suggestion further stressed by Cox (1999). I propose to build on and spread this kind of learning activity. Doing so in terms of touch- and gesture-based applications for tablets, smartphones and digital whiteboards is in line with the conclusions and suggestions of Anastopoulou et al. (2011), who propose that “Providing students with personal multimodal technologies may help them to engage in learning science concepts”.

The second major work that the suggested approach of this thesis and its further development is based upon is the study presented by Kellman, Massey and Son (2009). The idea is that the methods presented in that paper could increase the effect presented by Anastopoulou et al. (2011), e.g. by enabling students to learn in a more informal setting on their own. In the study by Kellman, Massey and Son (2009), the learners were given the task of pairing graphs with other representations, such as equations and written text, on a computer before receiving any formal instruction. The hope was that after many trials they would see the patterns and connections between the different representations and what they actually represent. This was indeed what was observed. The students were also observed to pick up the more formal instruction and concepts of mathematical graphical representations more easily. The knowledge taught with this inverted approach was also recorded to stick longer, compared to the traditional method where the students are presented with formal education first and problem solving second. Similarly, the design suggested in this thesis would hypothetically let the learners find and see patterns and connections between graphs without the need for formal instruction.

Some efforts in the direction pointed out above have already been made. There are a few applications on the market where some of the different aspects of touch capability, authentic and perceptual learning, multiple representations, ubiquity and mobility are addressed. One example is the Vernier Logger Pro desktop software and the quite recent Vernier iPad application (Vernier, 2011). Here the learner can film an object in motion and, using the iPad's touch capability, extract graphical representations of the object's motion. Kelly (2011) has studied the use of publicly available game-like apps for iPod touch (mostly not especially made for physics studies) in middle school physics education. A great deal of benefit, e.g. from the standpoint of the students' engagement level, was found. A study carried out by Barab and colleagues (2007) further stresses the importance of engaging the learners in embodied participation. The application presented in this thesis could then be part of the multitude of embodied practices sought for by Barab et al. (2007). Furthermore, major efforts on the hardware and research side, like that of Texas Instruments (2011), build upon ideas similar to those presented here. However, the aspects of embodiment - using multi-touch or body movement - and informal perceptual learning have not been realized in one application. Nor do any of the above efforts let the learners generate and interact with the graphical representations directly using their own bodies; hence there is no real-time multimodal feedback. Recently, however, the people at SmalLab Learning (SmalLab Learning, 2012) have come up with a combination of Kinect-type hardware and software similar to that described in this thesis. In one of the SmalLab scenarios within the Flow concept, the learner moves their hands up and down, with the task of matching the motion with more abstract representations of motion such as graphs or equations. SmalLab Learning has also done research on their designs of combined hardware and software scenarios. The effects on learning, specifically the effect of embodied learning, were also evaluated, with overall positive results (Birchfield & Johnson-Glenberg, 2010). At the time of design and production of the application studied in this thesis, these efforts from SmalLab were not known to the author, so most of the features and design aspects were derived from the research of Anastopoulou et al. (2011) and repackaged to fit and function on a touch-based piece of hardware such as the iPad.

Research questions

In the work by Anastopoulou et al. (2011) one clear result was obtained: the students performed significantly better if they used their hands to generate graphical representations of motion in the learning process[1]. This is a significant finding in more than one way, since the reference group got instruction and explanations on the physics involved and were guaranteed to get the right answers to the training questions, potentially giving them a head start compared to the test group. Furthermore, the results were solely due to the difference in score between the two groups on the more analytical questions aimed at understanding the relationships between the two different types of graphical representations (questions 6-9 in appendix 2).

Coupled to these findings is the first aim of this thesis, as expressed in Research Question 1 and the corresponding Hypothesis 1 below. It can be seen as an attempt to confirm the findings of Anastopoulou et al. (2011), but it is also an extension of the same study when it comes to the age group of the test persons and the actual technology and input method used. In this study an iPad application visualises graphical representations of the concepts of motion (position and velocity versus time). The reference group consists of students attending a lecture where the same application is used and shown, and where the concepts and content are explained.

Research Question 1: Will the students using their hands in the learning process (the “doers”) perform significantly better than those in the control group (the “watchers”) on the analytical type of questions?

Hypothesis 1: The test group - the doers - will perform significantly better than the reference group (the watchers) on a post-test on the more analytical types of questions (6-9 in appendix 2), while no such difference will be seen for the more descriptive types of questions (1-5 in appendix 2).

The research question aims to verify the evidence that a multimodal learning experience, mixing simultaneous visual impressions with hand gestures performed and experienced by the learner, will improve his or her learning. In particular, the question aims at providing insight into the specific example of learning about the graphical representations of motion in physics, using the learner's own motion as the object of inquiry.

Mentioned in Anastopoulou et al. (2011) was the observation that many of the doers used their hands during the post-test, in such a manner as if they tried to recall or recreate the motion of their hand during the learning session. To test if this might be the case - that using your hands to recall or recreate the motion in a test scenario can help you perform better - a second research question and a corresponding hypothesis are tested.

Research Question 2: Will the test persons who use their hands during the post-test in such a way as to mimic the hand movement described in the questions perform better than the test persons who do not, on the more analytical type of questions?

Hypothesis 2: Test persons moving their hands during the post-test in a fashion similar to when using the application will score significantly higher than those who do not on the more analytical type of questions.

This hypothesis could help find alternative explanations for the difference in results between doers and watchers. In the original paper by Anastopoulou et al. (2011) no difference was reported with regard to the gender of the test persons. However, this is basically a default question to ask.

Research Question 3: Is there a difference in performance on the post-test between the male and female test subjects?

Hypothesis 3: There will be no significant difference in the post-test results between male and female test subjects on the more analytical type of questions.

Parallel to the above, more theoretical, questions, the evaluation of the developed application will also test a more practical question of pedagogical interest: namely, whether it is possible for a young learner to grasp the abstract concepts of motion in a learning setting with little or no help or guidance from a teacher. This last research question will be evaluated in a more qualitative fashion, looking at how well the students managed to cope on their own during the training sessions. Will they need support to get started during the training session? Will they need help using the different features of the application?

Research Question 4: Can this type of knowledge and understanding be obtained using an application on an iPad, without the aid of a teacher for explanations and guidance? How does the knowledge gained from this type of learning activity compare to that from a more teacher-guided approach?

The Application - method and design

In Anastopoulou et al. (2011) the method of generating and interacting with the graphs is not the same as in this study. There, the learner's hand motion was measured “directly” using a motion sensor attached to the test person's hand. The application developed as part of this thesis measures the speed of the learner's motion in terms of the finger moving across the screen of a tablet. The motion is then turned into a velocity versus time and a distance versus time graph, respectively.

The Velocity application was developed as a native iOS application to run on an Apple iPad. Therefore the programming language Objective-C as well as the Cocoa Touch framework were used. All the programming was done using the IDE Xcode 4.x provided by Apple. It was built to run on devices with the major release of iOS 5. The graphical rendering of the Velocity application is based on OpenGL ES.
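To make the description above more concrete, the sketch below shows one way a touch-handling view could sample the vertical position of the finger and estimate its velocity with a finite difference. This is an illustrative sketch only, not the original source code; the class name and the appendSampleWithPosition:velocity:time: hook are hypothetical.

```objc
// VelocityView.m - illustrative sketch of touch sampling (hypothetical names).
#import <UIKit/UIKit.h>

@interface VelocityView : UIView
@property (nonatomic) CGFloat lastY;           // last sampled y-position of the finger
@property (nonatomic) NSTimeInterval lastTime; // timestamp of the last sample
@end

@implementation VelocityView

- (void)touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event
{
    UITouch *touch = [touches anyObject];
    self.lastY     = [touch locationInView:self].y;
    self.lastTime  = touch.timestamp;
}

- (void)touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event
{
    UITouch *touch     = [touches anyObject];
    CGPoint point      = [touch locationInView:self];
    NSTimeInterval now = touch.timestamp;
    NSTimeInterval dt  = now - self.lastTime;

    if (dt > 0) {
        // Finite-difference estimate of the vertical velocity (points per second).
        CGFloat velocityY = (point.y - self.lastY) / dt;
        [self appendSampleWithPosition:point.y velocity:velocityY time:now];
    }
    self.lastY    = point.y;
    self.lastTime = now;
}

- (void)appendSampleWithPosition:(CGFloat)y velocity:(CGFloat)v time:(NSTimeInterval)t
{
    // Hypothetical hook: store the (t, y, v) sample so that the d-t and v-t curves
    // can be redrawn (the real application renders its graphs with OpenGL ES).
}

@end
```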

Apart from including the proposed functionality, the main ideas for the application from a design point of view were the following:

This would enhance the potential for learning in more informal settings. There is a large number of features that could be implemented to support the basic pedagogical ideas behind learning about motion and graphs. The prototype application used in the evaluation phase of this thesis included the basic features needed to perform the experiment on students and try to answer the research questions. The more technical aspects of the design are outside the scope of this thesis, but some information can be found in the paper by Davidsson (2012).

Below follows a brief description of the main features and the operation of the application from a learner's point of view. As shown in the instructions section of the application (see picture XXX below), the main view shows a coordinate system with a white y- and x-axis on a grey background. On the right edge of this main field, a black dot connected to a green line paints out the motion of the user's finger. When the application is running, the entire graph moves to the left at a constant speed; touching or releasing the screen turns this apparent motion on or off. At the same time, a red line shows the corresponding velocity in the y-direction of the finger touching the screen. This red line, the velocity-time graph, can be turned on or off as an option from the top panel inside the application. This feature is used by the students in the training session where, e.g., they first perform a motion and then are asked to draw the corresponding velocity-time graph. The “answer” can then be displayed by turning the red (velocity) graph on.

The rest of the main features of the application utilized in this study are shown in picture XXX below. The user can turn the grid on or off, providing help for the eye, e.g. when comparing the two kinds of graphs. The user can turn the velocity-time graph on and off, and of course the scene can be reset, emptying the screen of any graphical representations.

As pointed out in Anastopoulou et al. (2011), the velocity graph corresponding to a certain real-life motion can easily look spiky and full of fine details, due to the non-“perfect” nature of any movement. Another factor adding to this is that the interface or apparatus capturing the motion is not perfect in its measurements. These fine-grained features (see pic XXX) are not essential when it comes to studying the main features of a movement and the corresponding graphs. Especially when it comes to making connections between a distance-time graph and a velocity-time graph, these details could easily distract the user from looking at the main features of the motion's graphical representations. At the same time, the amount of information would be much too large for a user to process in a real-time generation and interpretation of a graph. To overcome this problem, a “quantization” of the velocity was implemented, where the displayed velocity-time graph value was changed only if the velocity calculated from the finger motion changed by an amount larger than a certain value, effectively suppressing the fine-grained structure of the “raw” data. This feature can be turned on or off by the user. The students were urged to use this feature on several of the training tasks during the trials.
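A minimal sketch of how such a quantization could work is shown below, assuming a simple threshold scheme; the function name and the threshold value are hypothetical, as the exact scheme used in the application is not specified here.

```objc
#import <CoreGraphics/CoreGraphics.h>
#include <math.h>

// Hypothetical threshold, in points per second; tuning it trades responsiveness
// against smoothness of the displayed velocity-time graph.
static const CGFloat kVelocityStep = 40.0;

// Return the velocity value to display: keep the currently displayed value unless
// the newly computed raw value differs from it by more than the threshold.
CGFloat quantizedVelocity(CGFloat rawVelocity, CGFloat displayedVelocity)
{
    if (fabs(rawVelocity - displayedVelocity) > kVelocityStep) {
        return rawVelocity;
    }
    return displayedVelocity;
}
```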


Test design

To make comparisons with the study by Anastopoulou et al. (2011) possible, the testing of the application was done in as similar a manner as the circumstances allowed. The quasi-experimental test was performed on 26 high school students of ages 17-18, divided into two sub-sets of 13 pupils each. One subset got to use the application; they constitute the “doer” group. The other subset got a lesson in a more formal setting where the author used and showed the application; these students constitute the “watcher” group. All test persons were students in a high school science program. At the time of the trials they were in the last semester of their high school studies, with the summer holiday around the corner. The students were tested in groups, in a total of five sessions: three doer and two watcher sessions. This is perhaps the most significant difference in the design of this study compared to that of Anastopoulou et al. (2011), where each student was tested individually. This study thus involves the testing of an application designed to be used also in more informal settings, with little or no assistance from an educator. This difference will be discussed in the closing chapter of this thesis. Each of the sessions lasted around one hour and 15 minutes. During the sessions a colleague, from now on denoted test leader number two, assisted in blinding the results from test leader number one (the author), who would later score the tests. Both test leaders made observations related to the gesture hypothesis during the post-test. The instruction was to make a note for the respective student every time that student moved their hand in such a way that it could be interpreted as if they were visualizing using the application, although they did not have access to the iPad. Being two observers made it less likely that such movements from any of the 5-7 test subjects would pass unseen.

In short, each session started with a pre-test to gauge the students' prior knowledge of the topic. Then they went through a training session where they individually got to use the application (doers), or saw test leader number one use it (watchers). During the training they answered a number of questions on a form. Then followed a post-test where both groups got to answer a number of questions on their own, with no aids but pen and paper. I will now describe the sessions in more detail, as they took place according to a prepared schedule.

The doer sessions

Below follows a timeline of the doer sessions.

  1. The students were placed in a slightly U-shaped formation before the two test leaders
  2. Test leader number two assigned each student a random number, kept secret from test leader number one
  3. Test leader number one introduced the students to the test
  4. Test leader number one briefly showed the application, including the main features and buttons that the learner should be able to control
  5. Test leader number two handed out the pre-test (see questions 1-4 in appendix no. 1)
  6. The students individually finished the pre-test in around 10 minutes, with some variation
  7. When a test subject was ready to continue, they were instructed to raise their hand and were then handed an iPad with the application installed and already opened
  8. Each student was to answer a number of assignments as training (see questions 5-11 in appendix no. 1) with the help of the application
  9. When a student was done with the training session, they were asked to wait until the rest of the group had finished (generally step 8 took 30-40 minutes)
  10. They were also asked to review the questions again, and to try the application some more
  11. When all test subjects were ready, the sheets of paper containing the pre-test and training questions were collected
  12. The post-test was handed out
  13. During the 20-30 minute post-test, the two test leaders independently observed the students, taking notes on a score sheet to collect data for the gesture hypothesis
  14. When the students were done, if they had time left, they were prompted to look over their answers again
  15. When time was up, the tests were collected and the identity of the test person for each test sheet was protected by using the randomly assigned number instead of the name of the student on the test
  16. The list of random numbers paired with the names of the students was put in a sealed envelope, and so were the tests and score sheets

One interesting observation made during the doer sessions was that the time the students actively interacted with the application, versus the time they spent writing down the answers to the questions (or doing something else), varied quite a lot. However, no rigorous quantitative data were collected on this matter.

The watcher sessions

Below follows a timeline of the watcher sessions. The first part of the watcher sessions was identical to that of the doer sessions.

  1. The students were placed in a slightly U-shaped formation before the two test leaders
  2. Test leader number two assigned each student a random number, kept secret from test leader number one
  3. Test leader number one introduced the students to the test
  4. Test leader number one briefly showed the application, including the main features and buttons that the learners would later see demonstrated by test leader number one
  5. Test leader number two handed out the pre-test (see questions 1-4 in appendix no. 1)
  6. The students individually finished the pre-test in around 10 minutes, with some variation
  7. When the students were done with the pre-test, test leader number one turned on a document camera
  8. The application, running on an iPad, was shown on a projection screen, filmed with the document camera
  9. The training session began with test leader number one reading each of the questions in order, allowing time for reflection when this was asked for
  10. Test leader number one further guided the students towards the answer, using the application as shown on the projection screen
  11. When all training questions were finished, the first worksheet of questions was collected
  12. The post-test was handed out, processed by the test persons and later collected using the same procedure as in the doer sessions (steps 12-16)

In all, roughly the same time was spent learning before the post-test for both doers and watchers. Now follow a few notes on the training session (steps 9-10 above). The graphs, and the finger motions producing them, performed by test leader number one, were often repeated several times. By doing this, the test persons would get the same chance of seeing the connection between the actual movement and the graphical representations as the doers did. The difference was that the watchers did not have the ability to personally interact with the graphs, manipulating them with their own hands. On the other hand, the watchers had the chance to ask questions, and they were guaranteed to get the correct answers to the training-sheet questions presented to them. They also got an explanation of the graphs and the correlations between them, as well as the connection to the actual motion that produced them. This latter connection is hypothesised to be made clearer for the doers, since they themselves actually perform the motions generating the graphs. Apart from these hypothesised advantages, the efficiency and quality of the watchers' session was intended to be as good as possible, so that no other (obvious) factors could account for a potentially observed (positive) effect seemingly confirming the hypotheses above.

Results

Here, the results of the students' pre- and post-tests are shown. The test scores are presented in terms of the median values for each group, the normal practice for ordinal data. Ordinal data, in contrast to interval data, are not suitable for direct analysis, e.g. in terms of standard deviations. To find out whether one group performs better or worse than another group, one needs to apply a different method, as described in the analysis section. The number of test subjects in each category is shown under the column header (n). For the post-tests, the median values for the total scores on questions 1-5 and 6-9 are shown. These two groups of questions are related to the two categories - descriptive and analytical - of interest in hypotheses 1-3.

Pre intervention test results

Group               (n)    Median value (max. 19)
Doers               13     18
Watchers            13     17
Gesture group        7     18
No gesture group    19     18
Male                11     18
Female              15     17

Post intervention test results

Group               (n)    Median Q 1-5    Median Q 6-9
Doers               13     9               16
Watchers            13     8               13
Gesture group        7     9               16
No gesture group    19     8               13
Male                11     9               16
Female              15     8               11

Below, the raw data results are presented in bar-chart format, comparing the post-test results for the doers vs watchers and the female vs male test subjects.

Analysis of results

For this type of quantitative data it is hard to assess whether the differences between the mean or median values of two samples are actually significant using measures like standard deviations directly. This is due to the ordinal nature of the data, as compared to interval-type data, e.g. the distribution of measured masses for a particle in a collider experiment. For this reason, the more suitable Mann-Whitney U-test (M-W test for short) was used. In short, this test compares two samples of independent data and assesses whether one set of data tends to have larger values than the other. The test scores are first translated into ranking scores, the lowest test score getting the lowest rank; the 26 test results are hence ranked from 1 to 26. The M-W test gives the difference between the rankings of the samples in terms of a standard score, Z, expressed in standard deviations. A Z-value of (more than) 1.65, the so-called critical value in a one-directional test, indicates a rejection of the null hypothesis, leaving less than a 5% probability (the traditional choice of significance level) that the observed result is due to chance.
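For reference, the standard score can be written out as follows. These are the textbook Mann-Whitney formulas (without tie or continuity corrections), included here for clarity rather than taken from the analysis itself:

```latex
% With sample sizes n_1 and n_2, and R_1 the sum of the ranks in the first sample:
U = R_1 - \frac{n_1(n_1+1)}{2}, \qquad
\mu_U = \frac{n_1 n_2}{2}, \qquad
\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}, \qquad
Z = \frac{U - \mu_U}{\sigma_U}
```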

When marking tasks of this sort, there is always some room for subjective interpretation. To check whether this could change the outcome of the study, a second round of marking with less generous interpretations of the students' answers was performed; the results of this exercise did not change any of the final conclusions.

Below I will first analyse the raw pre-intervention test scores for the different groups. Then follows the analysis of the post-intervention test data for hypotheses 1-3 using the Mann-Whitney U-test, looking at each hypothesis on its own. The observed difference between the genders will be used to control for this factor by means of a correction for the gender difference in a re-analysis of hypotheses 1-2, in an attempt to separate the influence of gender from that of using the application on the final score of the post-test.

The fact that not only one hypothesis has been analyzed will be discussed in the last part of the analysis section. With multiple (independent) hypotheses in a test like this, there is always the chance that one of them will be accepted (passing the significance limit) simply due to chance. This will be dealt with by demanding that the probability of any one of the hypotheses being (wrongly) accepted should be lower than 5%, and by examining what this implies for the acceptance significance levels of the individual hypotheses.

Pre intervention test scores

As can be seen in table (XXX) above, the median values for the pre-intervention test were close to the maximum number of points for all groups. This makes it hard to say whether there are any major differences in knowledge between the different groups at the outset of the experiment. Performing a M-W test on the different pairs of groups shows no significant difference between the doer and watcher groups or between the gesture and no-gesture groups. The difference between the genders is slightly larger, perhaps suggesting that the male group entered the experiment with slightly higher knowledge of the subject, although this interpretation is not statistically verified.

The gender hypothesis

As can be seen above, the largest difference between any two identified groups of test subjects was that in the post-test scores comparing male and female test subjects. The median value for the 11 males on post-test questions 6-9 was 16, compared to a median value of 11 for the 15 females. In contrast, the median values on the pre-test (and for questions 1-5) only differed by one point. Performing a M-W test on post-test questions 6-9 for these two groups gives a standard score of Z = 2.18 (see table XXX below), thus passing the critical value and reaching the 5% level of confidence for a two-tailed rejection of the null hypothesis[2]. The same test performed on questions 1-5 did not show any statistically significant difference.

For hypotheses 1-2 this fact will be controlled for. The individual scores of the post-test are then measured in terms of the deviation from the mean value of the respective gender, after which the M-W test is performed on this new dataset. This procedure is an attempt to single out the difference between doers and watchers, assuming that the observed difference between the genders is a truly generalisable observation. Without this correction, the difference between the doers and the watchers could be explained by the fact that the proportion of boys and girls in the doer and watcher groups was not equal; the doer group had a higher proportion of boys than the watcher group.

A second method to control for the difference in gender composition of the two groups was also used, to check for consistency and validity. Using this second method (I call it the 4x4 method), the 4 male watchers were grouped together with 4 randomly selected female watchers and compared to 4 randomly selected male and 4 randomly selected female doers. This ensures that the proportion of male and female test subjects is the same in the two groups. This randomization procedure was performed 5 times, followed by a M-W test on each of these sets of 16 test subjects. Due to the smaller statistics in each of these M-W tests, a smaller value of the standard score Z was expected; however, it should still point in the same direction as the gender comparison using the first method. This was indeed the case, with a mean standard score of Z = 1.81, slightly smaller than the 2.18 obtained using the full sample. This is not surprising, but could also point towards the application having some influence as a covariant variable. Therefore, in the same manner as just described, one can also look at five sets of randomly selected subsamples comparing the 4+4 doers with the 4+4 watchers. In these sets of test persons there are as many doers as watchers, and as many males as females within the doer and watcher groups. Thus, if the effect was in fact due to the application being used, this should be hinted at in a M-W test of these five sub-sets of data. The results of this comparison are shown below.
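Written out, the gender correction described above amounts to the following transformation (my notation; the text does not give an explicit formula):

```latex
% Corrected post-test score for test subject i, where g(i) is the subject's gender
% group and \bar{x}_{g(i)} the mean post-test score of that group; the M-W test is
% then run on the corrected scores x_i'.
x_i' = x_i - \bar{x}_{g(i)}, \qquad g(i) \in \{\text{male}, \text{female}\}
```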

The doers versus watchers hypotheses

In Anastopoulou et al. (2011) the by far largest difference in post-test scores between the doers and the watchers was found for questions 6-9, gauging the students' ability to see and understand the correlations between the d-t and v-t graphs. A M-W test performed on the sum of scores on questions 1-5 should therefore give a non-significant difference between the doers and the watchers. It should be noted that the standard score from a M-W test performed on a sub-set of questions is expected to be smaller than that for the whole test, due to the smaller possible spread of scores. The following table shows the standard scores comparing the doer and watcher groups using the M-W test on the two parts of the post-test. It also shows the corresponding results for the two gender groups, as well as the results corrected for the observed gender difference. The median values can be seen in table XXX above.

Comparison                                 Q 1-5: Z-score    Q 6-9: Z-score
Doers vs Watchers                          1.28              1.10
Male vs Female (2-tailed)                  1.27              2.18
Doers vs Watchers, corrected for gender    1.10              0.77

It is clear that the only significant difference among the standard scores for questions 1-5 or 6-9 is that between the male and female groups for questions 6-9. On the other hand, that difference is clearly significant (on its own), passing the critical value for Z. The hypothesis that the increased score for the doers on the post-test would be mainly due to their better understanding of the connection between the d-t and v-t graphs can not be verified. This is in clear contrast with the results found in Anastopoulou et al. (2011). Just as the result for doers vs watchers was controlled for the influence of gender, the gender result can also be controlled for the influence of doers vs watchers. This was done using the 4x4 sampling method described in the last section, which ensures that the same number of doers and watchers is assigned for each gender. The corresponding mean Z value for the five randomly assigned groups was 0.61 for questions 6-9, a value that points in the same direction as presented in table XXX above. The main reason for the observed difference between the doers and the watchers thus seems to be the fact that the doer group, compared to the watcher group, consisted of proportionally more male than female test persons.

Gesture versus no gesture hypothesis

In the discussion in Anastopoulou et al. (2011) it was pointed out that many of the students in the doer group made gestures with their hands, gestures that suggested to the researchers that the students envisioned their hand moving in a manner that would produce the graphs and movements featured in the post-test. It was suggested that this behaviour could serve as a cognitive link between the motion in question and the graphical representations produced by the apparatus used.

Out of the 26 test subjects in this study, 7 were classified as exhibiting these kinds of hand gestures. Out of these 7, 6 were male. Hence, the gesture group is expected to perform better than the rest of the group, given that we already know that the male test subjects performed better than the female ones. Or could it be that the causality here is reversed: is it being more prone to gesturing that makes the male test subjects' results superior? The standard score for the difference between the gesture group and the no-gesture group on questions 6-9 was much smaller than the standard score for the difference between the gender groups as a whole, and the difference between the gesture and no-gesture groups on questions 1-5 was likewise not significant. A further analysis in terms of a M-W test was made to compare the 6 male test subjects who made gestures with the 5 male test subjects who did not. If the hypothesis under investigation were correct, the 6 male gesture-makers would be expected to perform better (on questions 6-9) than those not making gestures. However, this was not found. In fact, the effect might be the opposite, judging from the negative sign of the standard score; the negative sign means that the opposite effect from that being hypothesised was found. So, it seems unlikely that being more prone to gesturing is what makes the male group perform better than the female group. It might also be worth noticing that 4 out of the 6 male students making gestures were also in the doer group. This might even contribute to the smaller difference between doers and watchers compared to the study by Anastopoulou et al. (2011).

I will now propose one possible explanation of this potential effect, should it be confirmed in a new analysis. It might be the case that the students not using their hands while taking the post-test are more likely to know the material and how to answer the questions well; therefore they would not need the aid of gestures as a visual cognitive link to get the answers right. So, using your hands when trying to recall what you have learned could rather be an indication that you are not really confident in your knowledge and ability to answer the questions. To control for this possible effect and find out if there really could be a benefit from making gestures during a test like this, one would have to make a more thorough test, where one could e.g. prohibit one group of students from using their hands in some way during the test, the hypothesis being that for this group the mean score would go down.

A multi-hypothesis analysis

Given the fact that not one but three hypotheses were evaluated using analytical statistical tools, the statistical significance acceptance level needs to be adjusted in order to claim a discovery, or at least the confirmation of any one of them. While it is still true that there is less than a 5% probability that the observed difference in gender performance is due to chance, there is more than a 5% probability that any one of the three hypotheses would be confirmed due to chance. Making three “guesses” gives you a larger chance of being lucky than making just one guess, to make an analogy. So what confidence levels should be used for the individual hypotheses such that the total probability of any one of them being accepted by chance is less than 5%? A sketch of the standard correction is given below. With this stricter level, not even the finding of a gender difference should be accepted as statistically significant when looking at this work as a whole. It could, however, be used as a hypothesis-generating result, to be confirmed by more data.
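A minimal sketch of the standard (Sidak) correction for three independent hypotheses, which is one way to arrive at the stricter per-hypothesis level; the exact value used in the original analysis is not reproduced here:

```latex
% Probability that at least one of three independent hypotheses is accepted by chance,
% and the resulting bound on the individual significance level \alpha_ind:
1 - (1 - \alpha_{\mathrm{ind}})^{3} < 0.05
\;\Longrightarrow\;
\alpha_{\mathrm{ind}} < 1 - 0.95^{1/3} \approx 0.017
```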

Conclusions and discussion

In the analysis section of this thesis, some conclusions have undoubtedly already been made. In this section I will first summarize and expand on these and discuss possible further efforts. I will then discuss some observations made during the tests and, looking back at the work as a whole, what interesting questions could be worthy of further study. I will also make suggestions on how to improve the method used in this work.

The main conclusion from this thesis (in terms of its statistical significance) would be that there seems to be a difference between the two genders when it comes to the understanding of the more complex questions about the relationships between the different types of graphical representations of motion studied. Male students seem to do better on these types of questions, while the difference is smaller on the more straightforward tasks concerning the description of motion with graphs and vice versa - the interpretation of graphs in terms of real-world motion. The result is not statistically significant in the grand scheme of this analysis, due to the fact that several hypotheses were tested simultaneously, as explained in the last section. Correcting for the difference in gender, the results of Anastopoulou et al. (2011) cannot be reproduced with the same level of statistical significance as in that analysis, even though the results of this study hardly contradict the above-mentioned study. Thus, failing to reject the corresponding null hypothesis, the conclusion that learners perform better when using their hands while learning with the developed application cannot be drawn from the data presented here. To settle the question, one would need a larger statistical sample, preferably with the same gender distribution in the different test groups. In this work this was not of high priority and, due to practical reasons, was not realised. If this further analysis were to be done, some further improvements could be implemented, making the setup more experimental as compared to the presented quasi-experimental one. For example, the test should be double blinded; due to practical reasons, the person doing the grading was present during the post-test. One interesting aspect would be to look at the time the students spent using the application, and whether that has any bearing on their results; in this analysis it was observed that this time varied quite a lot.

Furthermore, there seems to be no difference in learning outcome between those students prone to using their hands in gestures when solving problems and those students who do not use their hands.

With regard to research question 4, it seems clear that the developed application works well for learning in informal settings without the need for a teacher's guidance. There were only a very few instances where the learners needed any help during the training session. An application like this could thus be used in a “flipped classroom” kind of setting, where the learners can explore and learn about the quite complex concepts of graphical representations of motion on their own. The explosion in the number of tablets in schools and in the homes of students makes this a possible reality. A valid point of discussion is the fact that the learning session with the iPad was quite short. Furthermore, it was not spontaneous on the learners' side, and the learning opportunity was not placed at the logical point in time for the students. The risk of quite a low level of motivation on the students' side is not negligible. One question left for investigation relating to this observation would be to explore what kind of effect a longer usage would lead to. Putting the usage of the application in context, together with and as a complement to more formal learning: what would the result be of a more blended learning setting, involving both this type of application and more traditional education?

One question raised early in this work about the method used is whether the introduction of a new “layer” (the finger marker on the screen) between the generation of the motion and the visualization of the motion will weaken the link between the mover/movement and the graphical representation, compared to the original study where the motion of the hand was measured “directly”. On the other hand, compared to the method of the original study, visually speaking, the distance graph is presented at the same physical place as the test subject's hand, thus potentially strengthening the same type of cognitive linkage.

Student activity and motivation

As already mentioned above, the time spent using the application was observed to vary quite substantially between the different doer-group test subjects. One reflection on this is that, if the test were run with only one test person per session, it might be easier to guide the learner to use the application more actively. By the same line of reasoning, it might be easier to motivate the student into actually engaging in the task. The risk of the learner's thoughts wandering somewhere else could also be smaller if the tests were performed individually instead of in groups. It could also be useful to film the sessions in order to be able to calculate the amount of time spent studying with the application. This could potentially explain some of the differences between this study and that of Anastopoulou et al. (2011), where the test subjects had a self-selected interest in science.

Another observation that calls for improvement is that during the demonstration of the application’s different functions in the first doer session, it was not explicitly stated that it is the velocity in the up-down (y-) direction that is measured and graphed. In this session at least one of the students was seen making finger movements suggesting to the author that they had not yet realised this fact. That student was, however, made aware of the situation, after which their finger movements changed accordingly. In the following doer sessions, as well as in the watcher sessions, this was made clear in the demonstration of the application.
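To make explicit what is measured and graphed, the following is a minimal sketch, assuming timestamped (t, y) touch samples, of how the up-down (y-) velocity could be estimated with a simple finite difference. It is an illustration only, not the application’s actual implementation.

```python
# Minimal sketch: estimating the y-velocity from consecutive touch samples.
# Each sample is a (timestamp_in_seconds, y_position_in_screen_units) pair.

def y_velocities(samples):
    """Return (time, v_y) pairs estimated from consecutive touch samples."""
    velocities = []
    for (t0, y0), (t1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt > 0:
            velocities.append((t1, (y1 - y0) / dt))  # screen units per second
    return velocities

# Example: a finger moving upwards at a roughly constant rate.
touch_samples = [(0.00, 100), (0.05, 110), (0.10, 121), (0.15, 130)]
print(y_velocities(touch_samples))
```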

Gender difference

Beichner (1990) found that male students performed better on tasks similar to those used in the post-test of this thesis. However, in that study the male students also performed significantly better on the pretest. In fact, it was stated that male and female students gained the same amount of knowledge from the studied learning activity, which included the simultaneous presentation of motion and different graphical representations. Even though that study did not include all of the modalities activated in the learning process studied in this thesis, it is evidence that the understanding of graphical representations of motion has been observed to differ between the two genders. Perhaps, had the pretest been more elaborate and included more complex questions, the difference between the male and female test subjects in this study would have been significant there as well; as it stands, the pretest left little room for this kind of differentiation. More general (but still relevant) differences between male and female physics students’ achievement have been pointed out by e.g. Lorenzo, Crouch and Mazur (2006) and by Pendrill (2005). Lorenzo et al. (2006) also point out that more interactive methods of learning can be used to decrease the gender differences.

Outlook and future efforts

There are several already stated ambitions when it comes to settling questions going into and resulting from this investigation. The question of gender difference, the hypothesised difference between embodied and more traditional learning, and the question of whether gestures are used as a way to summon mental images of the studied material could all be addressed by expanding the types of investigations described in this thesis. At the time of writing, our department is making connections with several schools to make larger-scale, more formalised studies possible. This would enable a more structured approach with a higher level of control over variables such as gender. Also, the grades of students could be compared with their progress using the type of application developed here. One could envision a study in a blended learning setting where the effects of informal and formal learning could be compared, testing the students before, during and after an intervention. The test design in Birchfield and Johnson-Glenberg (2010) could be used, where one group of students first gets instruction in a formal learning setting and then learns using the more informal technology-aided tools, while another group goes through the same steps and instructions in the reversed order. The students are tested before, in between, and after the two learning stages. In a more formalised research setting, this learning process could be inserted at the proper point in the students’ curriculum.

Apart from the already stated ambitions, a wide range of enhancements and new application areas is possible using the same or other types of hardware. The usage of the new types of internet-connected touch-based platforms opens up the possibility to add more features motivated by the theoretical assumptions discussed in this thesis. One could, for example, add the possibility for the learner to reproduce a graph on a larger physical scale by letting the learner run or walk in accordance with a velocity-time graph. Using the built-in GPS functionality of their phone or tablet, the speed and position of the learner would then be recorded and illustrated, once again involving students and their bodies in the construction and reproduction of abstract representations. Similar functionality has been implemented by SmalLab Learning (2011). Implementing the application on a digital whiteboard, extending the bodily motion to a larger scale, could also be looked into.
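As a rough sketch of the proposed GPS extension, the code below estimates the learner’s speed from timestamped latitude/longitude fixes using the haversine distance, producing the data needed for a velocity-time graph. All names and the sampling interval are illustrative assumptions, not part of the existing application.

```python
# Sketch: estimating walking/running speed from timestamped GPS fixes.
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6_371_000

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in metres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def speeds(fixes):
    """fixes: list of (t_seconds, lat, lon). Returns (t, speed in m/s) pairs."""
    out = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt > 0:
            out.append((t1, haversine_m(la0, lo0, la1, lo1) / dt))
    return out

# Example: a learner walking north at roughly 1.5 m/s.
print(speeds([(0, 57.70000, 11.97000),
              (10, 57.70013, 11.97000),
              (20, 57.70027, 11.97000)]))
```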

Another line of research and development for this type of application would be a richer integration of the concept of perceptual learning. As used in the experiments described in this thesis, perceptual learning was accomplished by combining the students’ pre-test and training worksheets with the usage of the application. A typical learning scenario that could be implemented would be for the student to be presented with a v-t graph and asked to reproduce the corresponding d-t graph using the application. This could be repeated multiple times, with the student getting feedback in the form of the correct graph and the student-generated graph superimposed. Furthermore, the attempts could be scored and stored, making it possible to track each individual’s progress. These are all features that have proved successful in terms of learning outcome, as shown by e.g. Kellman (2009) or as implemented by SmalLab Learning (2011).
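A minimal sketch of the scoring idea described above, assuming the target v-t graph and the student’s drawn d-t trace are sampled at the same regular time steps: the target d-t graph is obtained by summing the v-t samples, and the student’s attempt is scored with a root-mean-square error. Function names and the feedback format are illustrative only.

```python
# Sketch: derive a target d-t graph from a v-t graph and score a student's trace.

def target_distance(velocities, dt):
    """Integrate a sampled v-t graph (simple Riemann sum) into a d-t graph."""
    distances, d = [], 0.0
    for v in velocities:
        d += v * dt
        distances.append(d)
    return distances

def score(student_trace, target_trace):
    """Root-mean-square error between the student's trace and the target d-t graph."""
    n = min(len(student_trace), len(target_trace))
    return (sum((s - t) ** 2 for s, t in zip(student_trace, target_trace)) / n) ** 0.5

dt = 0.1                                # seconds between samples
target_v = [1.0] * 10 + [0.0] * 10      # move at 1 unit/s, then stand still
target_d = target_distance(target_v, dt)
student_d = [d + 0.05 for d in target_d]  # a slightly offset student attempt
print(f"RMSE = {score(student_d, target_d):.3f}")
```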

Adding sound to the multimodal experience of motion could also enhance learning when using an application of this type. Mapping the finger velocity to the frequency (pitch) of a sound played in real time, and perhaps even adding the possibility to “record” and replay the motion, could be one way to realise this expansion of multimodality. Another feature that could be implemented in the application is the vectorial representation of velocity, as well as the description of two-dimensional motion. A vector, an arrow showing the direction and speed of the finger, would then be shown in real time at the tip of the user’s finger. This would enable the study of a wider range of multiple representations of motion, as well as a wider range of types of motion.
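A sketch of the proposed velocity-to-pitch mapping, assuming the finger speed is already available in screen units per second; the frequency range and the linear mapping are arbitrary illustrative choices, not features of the present application.

```python
# Sketch: map a finger speed to an audible frequency (here A3 to A5).

def speed_to_frequency(speed, max_speed=2000.0, f_min=220.0, f_max=880.0):
    """Map a finger speed to a frequency in hertz, clamped to the audible range chosen."""
    fraction = min(abs(speed) / max_speed, 1.0)   # clamp to [0, 1]
    return f_min + fraction * (f_max - f_min)

for speed in (0, 500, 1000, 2000, 3000):
    print(f"{speed:>5} units/s -> {speed_to_frequency(speed):6.1f} Hz")
```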

It would also be of interest to try to control for the possible influence of motivation. This could be done simply by testing the application on university students in the fields of science or technology.

References

SmalLab Learning (2012). SmalLab Learning website. http://smallablearning.com/ [2012-07-15]

Anastopoulou, S., Sharples, M., & Baber, C. (2011). An evaluation of multimodal interactions with technology while learning science concepts. British Journal of Educational Technology, 42(2), pp. 266–290.

Barab, S. (2007), Situationally embodied curriculum: Relating formalisms and contexts. Science Education, 91: 750–782. doi: 10.1002/sce.20217

Cox, R. (1999). Representation construction, externalised cognition and individual differences. Learning and Instruction, 9, pp. 343–363.

Donovan, M. S., & Bransford, J. D. (Eds.) (2005). How Students Learn: History, Mathematics, and Science in the Classroom. Washington, D.C.: The National Academies Press.

Föreningen Datorn i Utbildningen (2011). Sverigekartan, http://www2.diu.se/framlar/egen-dator/ [2011-10-20]

Kellman, P. J. (2009). Perceptual Learning Modules in Mathematics: Enhancing Students’ Pattern Recognition, Structure Extraction, and Fluency. Topics in Cognitive Science, pp. 1–21.

Kelly, A.M. (2011). Teaching Newton’s laws with the iPod Touch in conceptual physics. The Physics Teacher, 49(4), 202-205.

Linn M.C., Eylon B.-S. (2011). Science learning and instruction: Taking advantage of technology to promote knowledge integration. New York, NY: Routledge.

Papert, S. (1980). Mindstorms: children, computers, and powerful ideas. New York: Basic Books.

Texas Instruments (2011). Research on TI-Nspire™ Technology. http://education.ti.com/educationportal/sites/US/nonProductMulti/research_nspire.html [2011-10-25]

Vernier. (2011). Video Physics for iOS. http://www.vernier.com/products/software/video-physics/ [2011-10-19]

Vogel, B. (2010). Integrating Mobile, Web and Sensory Technologies to Support Inquiry-Based Science Learning. The 6th IEEE International Conference on Wireless, Mobile and Ubiquitous Technologies in Education, pp. 65–72.

Skolverket. (2009). TIMSS Advanced 2008, Svenska gymnasieelevers kunskaper i avancerad matematik och fysik i ett internationellt perspektiv. Report. Available: http://www.skolverket.se/publikationer?id=2291 [2011-10-17]

Skolinspektionen. (2010). Kvalitetsgranskning. Fysik utan dragningskraft - En kvalitetsgranskning om lusten att lära fysik i grundskolan. Report. Stockholm: Skolinspektionen. (Skolinspektionens rapport 2010:8)

Davidsson, M. (2012). A Mobile Application With Embodied Multimodal Interactions for Understanding Representations of Motion in Physics. In I. A. Sánchez & P. Isaías (Eds.), Proceedings of the IADIS International Conference Mobile Learning 2012 (pp. 263-266). Berlin: IADIS Press.

Beichner, R. J. (1990). The effect of simultaneous motion presentation and graph generation in a kinematics lab. Journal of Research in Science Teaching, 27(8), pp. 803–815.

Lucia, M., & Cicero, L. (2010). The use of motion sensor can lead the students to understanding the Cartesian graph. In V. Durand-Guerrier, S. Soury-Lavergne & F. Arzarello (Eds.), Proceedings of the Sixth Congress of the European Society for Research in Mathematics Education (pp. 2106–2115). Service des publications, INRP.

Hostetter, A., Alibali, M. (2008). Visible embodiment: Gestures as simulated action. Psychonomic Bulletin & Review. New York. Springer.

Charoenying, T., & Gaysinsky, A. (2012). The Choreography of Conceptual Development in Computer Supported Instructional Environments. ACM International Conference Proceeding Series, pp. 162–167.

The Graphing Skills of Students in Mathematics and Science Education. ERIC Digest

Ainsworth, S. (2008) The Educational Value of Multiple-representations when Learning Complex Scientific Concepts. In Gilbert, J., Reiner, M., Nakhleh, M., (Eds.), Visualization: Theory and Practice in Science Education. Netherlands. Springer.

Kaput, J. J. (1989). Linking representations in the symbol systems of algebra. In S. Wagner & C. Kieran (Eds.), Research issues in the learning and teaching of algebra (pp. 167–194). Hillsdale, NJ: LEA.

Ainsworth, S. (2006). DeFT: A conceptual framework for considering learning with multiple representations. Learning and Instruction, 16(3), pp. 183–198.

Rose, C. (1985). Accelerated Learning. New York: Dell. ftp://195.214.211.1/books/DVD-032/Rose_C._Accelerated_Learning_(1985)(en)(190s).pdf

Lorenzo, M., Crouch, C. H., & Mazur, E. (2006). Reducing the gender gap in the physics classroom. American Journal of Physics, 74, pp. 118–122.

Pendrill, A-M (2005). University and Engineering Physics Students' Understanding of Force and Acceleration - Can Amusement Park Physics Help?, Contribution to Physics Teaching in Engineering Education (PTEE) 2005, Brno.

SmalLab Learning (2011). Velocity Match 4 - The Games. http://youtu.be/3nj_rK9mrbo. [2012-08-22]

Birchfield, D., & Johnson-Glenberg, M. (2010). A Next Gen Interface for Embodied Learning: SMALLab and the Geological Layer Cake. International Journal of Gaming and Computer-Mediated Simulations (IJGCMS), 2(1), 49-58.

 


[1] As compared to using their hands in gestures while doing the post-test.

[2] The null hypothesis being that there is no difference between the two genders. Had the alternative hypothesis been that male students outperform female students, the corresponding one-tailed test would have rejected the null hypothesis more clearly.