Mark Evan Nelson
In language and literacy education, it is becoming a pervasive sentiment that what it means to mean in the current semiotic climate is something different from what had hitherto been understood. Widespread use of multimedia technologies (mobile telephones, IP video conferencing, and Internet-enabled gaming, to name a very few) is fundamentally altering human communication, as Marshall McLuhan (1964) famously portended. Undeniably, in the bright light of such radical shifts in the forms and functions of language and literacy practices, it is evident that equally radical adjustments are called for in the domain of language and literacy education as well; and what is primarily required is to look both within and without the domain of language and broadly conceptualize how meaning is made in and across these new forms of electronically mediated, highly visual communicative practice.
Moreover, owing to this increased (and increasing) heterogeneity and hybridity of common contemporary texts (e.g. Web pages, PowerPoint™ presentations, etc.), a central concern of any new workable theory of electronically mediated meaning must be to understand the implications of multimodality, i.e. integration or orchestration (Kress & van Leeuwen, 2001) or braiding (Mitchell, 2004) of elements of a variety of semiotic modes1 such as imagery, written text, sound, and music within a single text. Though the complexity of such a project necessitates extensive investigation from a number of perspectives, my present concern is with understanding the ways in which drawing upon various semiotic modes in multimodal composition may interact with and shape the expressive power and voice (Bakhtin, 1981) of the author in the new media age2.
As a point of departure, I take Gunther Kress’s (2003) assertion that a theory of multimodal meaning-making must account for the complementary processes of transformation and transduction, which he explains as the purposive reshaping of semiotic resources within and across modes, respectively. These processes, Kress also points out, are together the engine that drives the psychological machinery of synaesthesia3, the emergent creation of qualitatively new forms of meaning as a result of "shifting" ideas across semiotic modes (p. 36). Drawing substantially upon the important conceptual work of Kress, as well as others, the main aim of this paper is to illuminate the nature and workings of synaesthesia within one particular, increasingly popular species of multimodal communication, digital storytelling.4 Referencing analysis of data drawn from case studies of the composition processes of English language learner (ELL) writing students in a university undergraduate course entitled "Multimedia Writing," I aim (a) to demonstrate practical evidence of the synaesthetic functions of transformation and transduction at work in the multimodal text creation process, (b) to specifically show how the synaesthetic functions of transformation and transduction can actually serve to both facilitate and hinder authorial voice, understood as the purposive expression of personal meaning, in consequential ways, and (c) to point out some possible implications of synaesthesia and multimodal communication for L2 authors in particular.
Authorship and Technology
The status of the author in the postmodern world is an uncertain one. Though it seems self-evident that each of us has our own meanings to express, there are those, most notably French semiotician Roland Barthes, who take the notion of authorship to be a mere chimera. Barthes (1977) famously argued that if there ever were such a thing as an author, s/he is presumed "dead." Barthes writes, "a text is not a line of words releasing a single theological meaning (the 'message' of the Author-God) but a multidimensional space in which a variety of writings, none of them original, blend and clash" (p. 146). The defining characteristic of Barthes’ (nonexistent) author is the ability to conceive and express original ideas. He does not say that we cannot write, simply that we cannot write anything that is unique to ourselves. This resonates, in large part, with Bakhtin’s (1981) theory of speech genres: stable discourse types, the words of others, which we cannot help but draw upon for our own communication purposes. In an important sense, Barthes may be right. There may well be no new stories, as the saying goes; we do speak and write in re-combinations of the words and ideas of others. However, I believe that we can authentically redesign these or, as Bakhtin optimistically explains, "populate" the utterances of others with our "own intentions" (p. 293).
Still, this is no mean task, particularly in this age of increasingly diverse, proliferative, and, most significantly, combinative communications media. What is required for authorship is to know the properties of the technologies and technological environments with which and in which we interact (Murray, 1997, p. 64). This is so because the potential for authorship, authentic expression of an authorial voice, lies not so much in the words, images, sounds, etc., that we employ, but rather in between them, in the designing of relations of meaning that bind semiotic modes together. Accordingly, I submit that an understanding of multimodality, as it relates to technologically mediated communication, is a necessary starting point.
I next explicate the notions of multimodality and synaesthesia in somewhat greater depth, since a grasp of these concepts in the specific senses in which they are meant here is crucial to the development of the arguments to follow.
Multimodality and Synaesthesia
In practically every field of inquiry, the past several decades have seen the advent of what W.J.T. Mitchell (1994) calls "the pictorial turn": a philosophical attention to imagination, imagery, and non-linguistic symbol systems and a setting aside of the “assumption that language is paradigmatic for meaning” (p. 12). However, this shift in focus is not only an ivory-tower intellectual trend; rather, it is symptomatic of the growing salience of non-linguistic forms of communication, especially visual/pictorial forms, in the lives of us all. This is perhaps best exemplified by the ubiquity of the Internet and other multimedia technologies which integrate imagery, voice, sound, written text, and other semiotic modes in ways that traditional print media, for example, do not. In this limited sense, textual communication has never been more easily or readily multimodal than it is now (cf. Kress & van Leeuwen, 1996, 2001; Kress, 2003).
Yet, human communication itself cannot be said to have become any more visual or multimodal than it hitherto has been. Gaze, gesture, etc., have always been indispensable features of even the most ostensibly linguistic of interactions (cf. Kendon, 1991; Lanham, 1993; Finnegan, 2002; Kress, 2003). Anthropologist Ruth Finnegan (2002), in fact, categorically defines human communication as a necessary coordination of "their powers of eye and ear and movement, their embodied interactions in and with the external environment, their capacities to interconnect along auditory, visual, tactile and perhaps olfactory modalities, and their ability to create and manipulate objects in the world" (p. 243).
Finnegan’s point notwithstanding, the multimedia texts with which we are now confronted, and, crucially, which we are all increasingly likely to be required to create (Stephens, 1998), require a different kind of sense-making than has been previously seen. Bolter and Grusin (2000), in their theory of remediation, assert that each new form of communications media renders opaque the transparency of its forebears, which is to say that new media call attention to the representational inadequacies of older media, which we have come to take for granted. Accepting this, what do we do when various media, old and new, are commingled? How do we assess the representational utility of one medium when it is really a hybrid of several others? What is required is an understanding of multimodal design, in line with the work of The New London Group (1996) and related scholarship (e.g. Cope & Kalantzis, 2000; Kress & van Leeuwen, 2001; Gee, 2003; Kress, 2003).
As Kress (2003) explains, different modes have different organizing logics, and, as such, different affordances (Gibson, 1979) for meaning-making. For instance, ideas encoded in imagery may be said to offer a different, more spatial and simultaneously apprehended kind of meaning than the same ideas5 encoded in oral language, which presents ideas in a sequentially and temporally organized way. Moreover, Kress goes on to explain, the logics and affordances of different modes also necessarily entail certain “epistemological commitments” on the part of the user:
Reflecting upon Kress’s illustration, one might feel that the epistemological peculiarities of each mode, and the weltanschauung structured according to its meaning-making affordances, are merely academic, i.e. without practical ramification. However, the consequences of mode for meaning can indeed be concrete. Tufte (1997) offers one powerful illustration of such a case. He shows how the Challenger space shuttle disaster might have been averted had the involved scientists better understood relations between their intended meanings and their chosen graphic presentation materials when discussing faulty O-rings and the possible danger of explosion (pp. 27-54).
So if, in fact, the quality of meaning of a sign is somehow ineluctably bound to the semiotic mode in which it is made manifest, how is it that coherent meaning is made in a text that is constituted by elements of different modalities that entail respectively different organizing logics and epistemological commitments? This is a question that cuts to the core of the notion of synaesthesia.
According to its common, clinical definition, synaesthesia is a condition whereby a person experiences one sensation, e.g. smelling a scent or seeing a color, in regular correspondence with a seemingly unrelated sensation. Moreover, these experiences are physical and real. In his well-known book The Man Who Tasted Shapes, neurologist Richard Cytowic (1993) presents the case of "Michael" who, upon tasting a chicken dish, complained that the chicken did not have "enough points," indicating that the flavor would have been better if it had been pricklier (pp. 3-6).
While it is amusing to wonder whether each of us may have the capacity to hear a ham sandwich, it is necessary to make clear that the definition of synaesthesia I trade on here is qualitatively different from that described above. It is an understanding of synaesthesia that is truer to the Greek etymology of the word, the two main meaning-bearing segments of which, syn and aesthes, originally meant "along with" and "sensation," respectively. The scientific definition of synaesthesia denotes a replacement of one sensation with another or a serendipitous co-occurrence of two sensations; however, as I understand it, Kress’s (2003) semiotic adaptation of the term, described above, refers to a process of emergence, where meanings presented in two or more co-present semiotic modes, e.g. the visual/pictorial and oral/linguistic, combine in such a way that new forms of meaning may obtain, in the (loosely) gestalt sense of a whole that is irreducible to and represents more than the sum of its parts. Further, Kress’s theoretical formulation identifies the co-operation of transformation, which "operates on the forms and structures within a mode," and transduction, which "accounts for the shift of 'semiotic material'…across modes" (p. 36) as the mechanism of the emergence of synaesthetic meaning. As such, semiotic synaesthesia must be understood not as a purely perceptual phenomenon, but as a phenomenon jointly governed by processes of sensing and sense-making. The crucial ramification of this is that the emergence of synaesthetic meaning can occur even when the semiotic elements involved are no longer co-present. Memory plays an important role in these interactions. So, I want to emphasize and hereafter illustrate that the true emergent quality of synaesthesia obtains not so much in multimodal texts themselves as in the act of authoring them.
So understood, synaesthesia can be the process and locus of "much of what we regard as creativity" (Kress, 2003, p. 36) in multimodal communicative practice. Importantly, though, creativity does not automatically or efficaciously obtain in multimodal text creation; in what follows, I also illustrate the hindrances to creativity that may be attributable to synaesthesia in the digital story creation process.
Multimedia Writing (hereafter referred to as MW) was the title of an experimental credit-bearing course offered at the University of California, Berkeley. This course, which I designed and implemented, provided the context within which to conduct the case studies described below. The questions that guided my research are as follows:
The semiotic awareness mentioned above is pivotal. Just as with language use, where focal awareness and peripheral attention both play key roles (van Lier, 1995), the ability to look, in Lanham’s (1993) words, both “at” and “through” media is vital to decoding and designing multimodal meaning.
As focal authors, I chose five UC Berkeley students who identified themselves as L2 English users and were enrolled in ELL sections of Berkeley’s freshman composition course. The other criteria upon which the selection of these students was based were that they scored at or below a threshold score of five (out of eight) on the institutional diagnostic essay evaluation and received fewer than 500 points on the verbal portion of the SAT. Admittedly, too, since participation in the MW course was entirely voluntary, the selection of informants was also naturally limited to those who wanted the credit and extra language practice and wished to participate.
These five informants represented four cultural/linguistic groups: two Hmong (Laotian) students, one Taiwanese native speaker of Mandarin, one Korean student, and one native speaker of Cantonese. Their time in the United States, as of the beginning of the term, ranged from three months on the part of the Korean student to eighteen years, her entire life, on the part of the Cantonese speaker. This and other pertinent information is summarized in Table 1 below:
For purposes of this paper, I refer principally to data collected from the digital story creation processes of Ally, Bonnie, Carrie, and Emma6 as these students’ work best exemplified the themes I develop here.
I began my research with the aim of investigating the potential efficacy of involving language learners in the process of digital storytelling, the creation of multimedia narratives which, via computer programs such as Adobe Photoshop and Adobe Premiere, integrate text, imagery (still and video), and sound (voice and music). Each student undertook the project of conceiving, designing and constructing an original digital essay over an eleven-week period (which became seventeen weeks) in the step-wise, draft-oriented manner of "process writing" (cf. NCTE Commission on Composition, 1984). These essays related directly to topics the students were writing on in their composition classes, and their assignment was to develop a short piece (three to five pages, in the written language mode) on a topic relating to language, culture, and identity, which were the dominant themes of their composition course readings and discussions. Titles and brief descriptions of the multimodal narratives produced by each of the four participants discussed here are as follows:
The class started with some theoretical orientation toward the nature and implications of semiotics and multimodal communication, drawing on the work of Gunther Kress, C.S. Peirce, and others. The purpose of this was to give the participants a framework within which to reflect upon and discuss their own respective creative processes. However, as my own research objective was to discern what kinds of semiotic effects may be organically attendant to the process of multimodal design, I did not emphasize theory. Rather, I preferred students to reflect throughout the creative process on the meanings they wanted to express in the work and how well they felt those meanings were communicated by the pictures, words, etc., (in isolation and combination) that constituted their individual essay "movies." The only real model the students were shown was an amateur digital poem produced by a local youth, and this served mainly only to demonstrate the ways images and language might be coordinated on a timeline. (See Hull & Nelson (2005) for a detailed explanation and analysis of this poem.) In order to avoid leading students in any particular direction, I was careful to frame the course and project for students as an opportunity to work intensively on their English writing while exploring, thinking, and speaking broadly about expressing ideas in different modal forms. For my part, I did not evaluate students’ writing according to any particular rubric. We did work together on adequately supporting and explaining ideas in writing, grammatical accuracy, etc., as needed; however, the primary focus was on trying to make an honest personal statement.
Of course, even in a non-graded, informal type of course such as MW was, conflicts of interest may arise when participants are presented with the task of making an honest, personal statement in the classroom context. In terms of intended audience, each informant reported creating the essays for themselves, their friends and family, me, and an unknown wider audience. Naturally, students were fully apprised of the research agenda I was pursuing, and they knew that with their permission I might want to show their work in an academic context at some point7. The analysis that follows should be considered alongside these limitations.
The approach to the research aspects of the project was situated roughly within the theoretical tradition of "design experiments," to the extent that the students and I collaboratively engaged in technology-oriented practices for purposes of accomplishing real-world goals (i.e. improvement of writing and communication skills, artistic creation) as well as coming to a better understanding of the nature of communication itself (cf. di Sessa, 1991; Brown, 1992; Collins, 1992). Over the course of the program, I continuously gathered a variety of data including students’ written journals, recorded in-class interactions, three periodic interviews with each participant, and the weekly collection of all digital essay-related artifacts, such as written essay drafts, image collections, and in-progress multimodal composition drafts.
These last two forms of data, the interviews and artifacts, were of particular value in developing the analytic categories that I next discuss. The interviews, conducted at the pre-writing/pre-visualization, rough construction, and completion phases of the creative process,8 provided the most reliable means at my disposal of understanding the thought processes that shaped each participant’s project-related decisions from week to week, and analysis of the artifacts themselves, the concrete traces of these decisions, helped to confirm or disconfirm the veracity of suppositions made on the basis of the interviews and other data.
Iterative coding and analysis of the data set offered up several categories of potential facilitators and hindrances to multimodal authorship. However, it should be understood that these categories are fluid. There is overlap between them. Also, the categorization of these as facilitators and hindrances is meant as a heuristic device and, as such, cannot sufficiently capture all of the possible subtleties of interpretation.
This section of the paper presents evidence of several types of communicative benefit that I observed to attend the multimedia authoring processes of participants in my study. I have termed these: Resemiotization through Repetition, Recognition of Language Topology, and Amplification of Authorship.
Resemiotization through Repetition
A salient feature of all of the essays created by students in MW was the repetition of certain images throughout each piece. In my experience, in the process of orchestrating recorded spoken language with imagery, music, etc., authors almost inevitably repeat certain images throughout the duration of the piece. Certainly, there may be purely practical reasons for this repetition, such as the inability to think of or find a more suitable image. Yet, my data suggest a particular efficacy to this repetition, although it may not have been the product of the conscious intention of the author at the time of her composing. Post-production interviews with two authors, Bonnie and Carrie, evidenced an emergent awareness on their respective parts of a deeper, more complex, more abstract quality of meaning that developed within the image-word sign in a multimedia composition as it progressed. Below are excerpts from the aforementioned interviews, in which Bonnie and Carrie each respond to the question of whether the meaning of an image changes as it is repeated in the essay, and, if so, how (Here and hereafter, please pay particular attention to the areas in bold):
The thematic commonality that these excerpts share is a recognition on the part of the author that additional meaning can accumulate within the same image as it is repeated due to the defining influence of what is said, shown, etc., "in between," to use Carrie’s words, different instantiations of the image as it is presented in the digital story. Iedema (2003), as a clarification to the concept of multimodality proffered by Kress, calls this kind of alteration in multimodal meaning "resemiotization."9 Specifically, Iedema argues that popular views of multimodal communication insufficiently account for the social and historical processes that bear on how conjunctions of image, language, and other modes find their way into conventional existence. In other words, the meaning of a multimodal text at any given moment is necessarily shaped by the meanings that are imputed to its component semiotic parts over time, parts that have semiotic histories of their own.
I suggest that in the excerpts above Bonnie and Carrie are describing the process of resemiotization as it operates in miniature, so to speak. In other words, they came to realize how it is that multimodal meaning can shift diachronically over the history of the six-minute duration of a multimedia narrative. [Please see the accompanying clip from Carrie’s narrative.] In Peircean terms (Peirce, 1955), in reference to the example of Carrie, the image of her parents graduates from iconic to symbolic significance. In the first sixty-nine seconds of Carrie’s piece, the image of her parents (Figure 1) is repeated three times, shown the first two times in conjunction with the spoken word "parents" and the third time with the word "motivation." Ultimately, parents are no longer just parents in this image; the photograph is resemiotized to become a symbolic expression of motivation itself.
There is an important implication of this finding: while it is true that such a shift in significance could occur within the context of a less multimodal text, such as a traditional written text, it would be relatively invisible. Whereas words may take on symbolic meaning within a written text, this symbolic meaning is often difficult to discern. Resemiotization of the multimodal variety illustrated above is comparatively more salient and readily apparent to author and audience alike, in that new meaning resides not simply in the word itself, but in the synaesthetic relationship between the image and the word. Logically, this greater salience should offer the benefits of a higher probability of heightened semiotic awareness and, following this, the affordance of new possibilities for powerful, intentional meaning-making. Multimodal composition gave these two authors the opportunity to recognize or notice a new, more sophisticated form of semiotic relationship, which may have particularly positive implications for ELLs, as is discussed in the implications section.
Recognition of Language Topology
A second form of facilitation of meaning-making that may inhere in multimodal communicative practice is what I call recognition of language topology. I use topology here in the limited, linguistic sense that Lemke (1998) intends in making the theoretical distinction between what language says (relations of categorical distinction) and what it looks like (its layout, font, color, etc.). Lemke refers to these distinct dimensions of meaning as typology and topology, respectively. I count this kind of understanding among the potential benefits of multimodal communication because it helps a multimodal author to engineer a multimedia piece for optimal expressive impact. Knowledge of the semiotic affordances and implications of what a written text encodes linguistically and visually, and of the ability to design complementary relations of meaning among these modes, represents a potent communication combination, indeed.
In the following excerpt from a Week 7 interview, Ally discusses her thoughts on the topological significance of her written language representations of Chinese language and culture:
As we see in Ally’s first and second turns above, she has decided to use an assemblage of Chinese characters to represent "Cantonese" (Figure 3) with no regard for what the characters mean (in the typological sense), but she has also chosen to represent Chinese culture with the English word "Chinese" (her third turn). At first blush, these choices may seem to evince a hindrance to Ally’s expression of the deeply personal connection she feels to her Chinese cultural Self; she went on to explain that she made both of these decisions based upon her perceptions of the directness of communication that such choices would facilitate with a particular non-Chinese audience in mind, an audience for whom Chinese characters are a symbol of Chinese-ness rather than language per se. She speaks to an audience who understands what is Chinese only when it is represented in English language terms.
However, fascinatingly, the seeming concession that Ally has made to her audience-Other may be no compromise of her authorial voice at all. These choices actually serve to reinforce the thesis of her piece, which is that the stereotypical images of different cultures that many bear in mind, and Ally counts herself among them, though seemingly self-evident and innocent, belie deeper, truer understandings. Moreover, she has effectively layered personal, topological meaning onto the typology of the English word “Chinese” by utilizing color symbolism (her association of Chinese-ness with the color red) and type style (boldness to metaphorically indicate strength). Tapping into both the typological and topological affordances for meaning-making in language, Ally has uniquely designed and accomplished a multilayered, richly textured, synaesthetic statement of her main idea.
We cannot know for certain how mindful Ally was of the sophisticated composition she was creating. While she indicated in the interview that she was not consciously aware of these choices, it could perhaps have been that she was simply unable to verbalize this understanding. However, authorship implies intentionality, so, can we say that Ally is expressing an agentive voice if she is unaware that she is doing so? Strictly speaking, we cannot. However, the fact remains in either case that multimedia writing ultimately afforded her a new, alternative means by which to experiment with and learn about structuring relations of meaning between modes, which may well be supportive of powerful expression in the future. One caveat before I go on: notwithstanding Ally’s successful appropriation of conventional representations in her piece, the role of audience and the threat to authorship posed by stereotyped images are consequential issues, a further treatment of which is to follow.
Amplification of Authorship
This third type of benefit, Amplification of Authorship, describes instances in which participants’ multimedia essays came to evince a deeper, fuller quality of meaning through the synaesthetic process of shifting expression across modal boundaries, i.e. transduction. More specifically, as I show, each of several students arrived at qualitatively different understandings as to what she wished to say, and how she wished to say it, as a product of recreating an utterance in a different modal form. Consider the following excerpt from an interview conducted with Emma in Week Six of the MW course:
As mentioned above, the theme of Emma’s piece is a presentation of herself as a bilingual “culture broker”; however, this was not the topic with which she began. Her initial idea, at the outset of the course, was to discuss her experience of being in between two cultures, with one foot in each, as it were. The first, tentative title she chose for her piece was "I am Americanized, but I am Korean." In developing this theme, as part of the process of pre-visualization, she collected several images of the type shown here in Figure 4.
These two illustrations of Janus-faced women were, in effect, products of transduction from the verbal mode to the visual/pictorial. A thematic, metaphorical extension of her original title, the Google search term that Emma entered to find these images (among others, of course) was "two-faced." This, she felt, was an apt linguistic representation, in brief, of the Self she wished to present in her piece. Yet, shortly after discovering and collecting these images, which ostensibly manifest the same idea as the term "two-faced," Emma determined that this was, in fact, not the Self she wished to project. In her own words, as presented in the excerpt above, Emma realizes that she does not so much feel "divided into two things" as "mixed" (her second turn above) and, importantly, this crucial difference was apprehended in the visual mode, though, of course, the realization ultimately found its expression in language as well. Following on this authorial choice, Emma settled upon the presentation of herself as "Culture Broker," one who bridges two cultures and shares equally in both.
A second case in which a deeper quality of meaning is synaesthetically revealed derives from the multimedia creative practice of Bonnie, who, in an interview conducted in Week Seven, explains part of her multimodal composition process as follows:
Bonnie’s story deals with her experience of being tri-cultural, an "Americanized, Taiwanese Kiwi," as she calls herself. In the preceding excerpt, she describes the moment when she realized that personal interaction and connection are the means by which she permanently, authentically reconciled and maintained these aspects of cultural identity. This realization represents a significant development in her thesis, originally stated as "You don’t have to give up your previous cultural identity when you move to a new country." Bonnie struggled with developing the deeper levels of meaning in her thesis (the "How?" and "So what?") that her composition instructor required her to divine; but when she laid out and examined the large group of photos she had collected over the years in the different places she had lived, images she imagined might be used in her digital story (see Figure 5 for two examples), the priorities that underpinned the how and so what were virtually, visually suggested to her. As psychologist Rudolf Arnheim (1969) explains, in making visual compositions, even very simple ones, children tend to select for and represent only features that most matter to them, most often unconsciously, rather than creating complete pictures of reality (p. 256). Following Arnheim, I suggest that this is something we all do. Bonnie created a visual composition in the form of a loose collage of photos and decided what did and didn’t belong based on what mattered to her, if unconsciously. As a consequence of the transduction of her ideas into pictures, and the accordant simultaneous and spatial apprehension of textual meaning afforded by visual imagery, Bonnie became consciously aware of, quite literally saw, the importance of personal connection to the maintenance of cultural identity. Thereupon, this new, synaesthetically generated understanding effected a positive transformation in the written linguistic mode, as was the case with Emma.
While I concur with Kress (2003) that "the world told is a different world to the world shown," (p. 1) I would point out, as the cases of Emma and Bonnie exemplify, that the possibility exists for the "world told" to be told in a way that is substantially more powerful and authentic, from the perspective of the author, when it is also shown.
Notwithstanding these apparent benefits to authorship that may obtain in multimodal communicative practice, analysis of my data also suggests that hindrances to the expression of agentive voices may be inherent to communication via the New Media: ways in which intentionality and authorship may be overpowered by expectations set up by social, semiotic convention and, thereby, effectively genericized. Evidence of such an effect was revealed in the MW data in at least two distinct but interrelated forms, which I discuss in the following sections under the headings Influence of Genre and Over-Accommodation of Audience.
Influence of Genre
It might be said that the common thread that runs through both of the following categories of hindrance to authorial voice in multimodal expression is a kind of subconscious, prescriptive sense of, simply put, the way things are "supposed to be" (Nelson & Malinowski, In press), and as such the way ideas, thoughts, even actual experiences are supposed to be represented, according to convention. By analogy, we might consider this awareness to be an ontological counterpart to Krashen’s (1982) linguistic "monitor": just as language users develop an internally governed sense of acceptability in language usage, it seems that the emergence of a sense of conventional appropriateness and efficacy in representational form is an entailment of extra-linguistic semiotic socialization as well. And of all the semiotic constructs that have evolved to delineate boundaries of representational aptness, genre may be the most powerful, prevalent, and yet seemingly invisible.
Recall from the discussion of Bakhtin above that in any utterance there is a tension between meaning that is ensconced within the shared semiotic system itself and the intended meanings of the author. Genre conventions are operationalized, in a sense, in the form of expectations about parameters for the proper representation of ideas on the parts of both the producer and receiver of an utterance. For instance, when one recognizes a text as belonging to the genre "action film," one expects its content to adhere to minimum standards of pacing (fast), excitement (high), and so forth; needless to say, one’s expectations for a one-person stage production are likely quite different. Of course, this is not to say that genres cannot be bent; they certainly can be. However, the power and influence of representational convention lies in its seemingly self-evidentiary nature, its innocence, as scholars such as Barthes (1972) and Fairclough (1989) have pointed out. We instinctively feel that a text that recognizably bears even some of the hallmarks of a genre should conform to basic expectations set up by that genre.
That said, in extending the discussion to multimodal communication, it is necessary to add another wrinkle: if each genre invokes expectations in the minds of multimodal meaning makers, both author and audience, what kinds of expectations arise from use of multiple genres and to what semiotic effect?
Among MW participants, this emerged as a prominent concern. Consider the following excerpt from a conversation with Bonnie in Week 17:
Our conversation revealed that the "overlap" that Bonnie speaks of in her third turn above refers to the regular restatement of main thesis ideas throughout the piece, for instance in the form of concluding statements at the end of each paragraph. Bonnie is plainly dissatisfied with the essay-cum-multimedia piece she has created in that it is doing something she feels it is generically not supposed to do, which is to inelegantly disrupt the flow of the action by articulating theme(s) too often and too directly.
As beginning academic writers like Bonnie are told time and again, explicitly stating and restating aspects of one’s thesis is integral to maintaining the coherence and cohesion of an English essay; however, at the end of the multimodal writing process, Bonnie held her essay to different stylistic and formal standards, i.e. those of the genre of film, which calls for an uninterrupted narrative flow of events (albeit stereotypically), a "continuous story" as Bonnie says (her third turn), rather than the hierarchical construction of ideas and obvious telegraphing of main points, a stereotypical exemplar of which is the garden-variety five-paragraph essay. In brief, Bonnie’s dissatisfaction arises from a failure of the essay to behave sufficiently like a movie. Then, from the standpoint of authorial expression, we might say that Bonnie’s narrative purposes could not be entirely accomplished in this piece because this resultant hybrid product of combining features of two distinctly different discourse genres did not sufficiently measure up as a movie, as she had come to expect it should. I argue that this change in Bonnie’s expectations is squarely relevant to the workings of synaesthesia, or, more precisely, to synaesthetic dysfunction.
The transduction of Bonnie’s narrative inspiration from one formal constellation of modal characteristics, the essay genre, into another, the hybrid essay/film, effected a transformation of Bonnie’s beliefs as to how her narrative should properly be realized in text; and, as Bonnie mentions, this transformation of beliefs about how her ideas and experiences should most appropriately be expressed would very likely lead to a transformation in the linguistic form of the narrative were Bonnie given the opportunity to rework her piece. From the standpoint of creativity, this is synaesthetic misfire, in that the multimodal, multi-generic text, to Bonnie’s mind, doesn’t work, i.e. the textual amalgam that emerges, albeit novel, is also characterized by dissonance.
Interestingly, in contrast to Kress (1997), who characterizes an author as one who assesses the semiotic resources she has available and selects those most apt for her communicative purposes, Bonnie’s example shows how the manipulation of resources can actually re-inspire the multimodal author in mid-stream, so to speak, i.e. effect an alteration in her purposes and design. Certainly though, in avoidance of reductionism, we must acknowledge that the cognizance on Bonnie’s part of the aforementioned disconnect in itself represents powerful evidence of emergent semiotic awareness, which undeniably is a good thing. However, the fact remains that the texts we are required to create are ordinarily not infinitely reworkable; we are most often bound by time, a class period, a semester, etc., and do not revisit and revise them after they have served their purpose. Bonnie’s piece, as far as she was concerned, was not fully efficacious, but done nonetheless.
Over-Accommodation of Audience
Following on the foregoing discussion of genre, I next introduce a related but distinct form of hindrance to authorial voice, one characterized by a self-consciousness about what the audience may expect from the multimodal piece, i.e. the audience’s sense of the way the essay/narrative is supposed to be, which may or may not be rooted in the experience of genre forms, such as in the example described above. Further, as I next illustrate, this self-consciousness or "hyper-awareness" can effect a desire on the part of the author to over-accommodate to the expectations of the audience. In the Week 7 interview excerpt below, Bonnie explains her thoughts on the relation between reality and stereotypical representation:
Here Bonnie discusses an image she has found online of a tower in Taiwan that is very near the place her family once lived (Figure 6). Her second turn in the interview reveals that she is fully conscious of the fact that the tower is not nearly so grand as it may appear—it is “old and rusty,” in fact—and that it is likely misconstrued by outsiders (i.e. non-Taiwanese) as a beautiful, authentic symbol of Taiwan and its culture. Nonetheless, Bonnie uses this image in her piece to signify Taiwan without irony. Recalling the example of Ally in the Recognition of Language Topology section above, we see that the example here bears a resemblance in that these authors each tailor multimodal conjunctions of meaning in their respective pieces so as to maximize transparency of meaning for the imagined audience, that is to appeal most directly to what they each think others would expect. However, what distinguishes these cases, and what qualifies Bonnie’s example as evidence of hindrance to authorship, is that Bonnie has allowed her sense of the expectations of her audience to effectively override what she knows to be accurate according to her own experience. Despite knowing that the tower image constituted a fraudulent representation of her experience of Taiwan, in the final product she let it stand.
Accommodation to one’s audience is not at all exclusive to multimodal forms of representation. In fact, multimodal accommodation might be fairly understood as a Discourse-level11 instantiation of what Giles & Coupland (1991), in the arena of language, call "accommodation processes," the natural desire of interlocutors to adjust to or "mirror" the features of one another’s speech. However, what is distinctive, and dangerous, about this kind of multimodal accommodation, I believe, is its tendency to propagate stereotypical forms of representation. Although a verbal signification allows for a good deal of semiotic slippage between a word form and its meaning (poetry would not exist otherwise), in comparison to a word-image sign or imagetext (Mitchell, 1994), the relationship between a word and its meaning is much less easily contestable. For instance, L-O-V-E is the way one expresses the idea of love in English, and this is difficult to argue against. Yet, when one endeavors to express the idea of love with an image, the choices are so numerous, the possibility for slippage so great, that one is more apt to become self-conscious about what is correct, what one is supposed to do. And, as was the case with Bonnie, one too often forgoes the opportunity to make a new meaning, a new metaphor, which is truly expressive of her/his voice in favor of what seems right, i.e. what the audience expects and easily understands. Not to be misunderstood, I do not assert that accommodation is at all intrinsically deleterious to making one’s voice heard; it is, in fact, a necessary feature of successful communication. It is problematic, however, in my estimation if an author elects to forgo making an honest personal statement in favor of telling indeterminate Others what she perceives they want to hear.
POSSIBLE IMPLICATIONS FOR L2 WRITERS
If we do, in fact, allow ourselves to move beyond a language-centered view of communication, a move I have thus far endorsed, and see human semiosis in its full, decidedly multimodal light, we cannot help but accept, as Lo Bianco (2000) asserts, that there is no sense or profit in approaching pedagogies of multiliteracies and multilingualism as separate entities. Within a curriculum of multiliteracies, different languages number among the many diverse resources that are brought to bear in making meaning. "Literacies are legion," as Lemke (1998) emphatically remarks.
Yet, this does not obviate or erase considerations of second language literacy learning in the traditional sense. Yes, it may be that in the near future we will all be "writing with video," as Stephens (1998) predicts. For the time being, however, written language still holds consequential sway, and for L2 writers of English, particularly those making their way within an English-dominant communication culture such as that of the US, the attainment of written English proficiency is a pressing matter. With this in mind, I next explore some possible implications for L2 writers of involvement in multimedia communicative practice.
Particularly, I return to the issue of semiotic awareness. In the particular case of L2 writers, I think that there are compelling reasons for increasing consciousness of the complex means by which meaning is derived in multimodal composition. One such rationale may reside in the relative concreteness that is attendant to multimodal communication, which is to say, that which may be invisible or inaudible in imagery and sound may be rendered visible and audible, i.e. noticeable, as was the case with each of the four participants in this study. Schmidt (1990, 1995) asserts that awareness and the momentary subjective experience of noticing the semiotic properties of language are vitally important to learning. Further, Chun & Plass (1996), in a study of the impact of multimedia annotations on L2 vocabulary acquisition, show evidence of enhanced learning when definitions are accompanied by images. So, with respect to some contexts and curricular goals, keying up the noticeability factor through the practice of multimodal composition could represent a value-added direction in language and literacy education.
Another rationale for involving L2 writers in multimedia practices like digital storytelling might be that it affords the L2 author the freedom to communicate and negotiate meanings by means of media that are not the L2. Recall, in the examples of Emma and Bonnie in the Amplification of Authorship section above, that new understandings arose as a result of transducing semiotic material across modes. L2 writing students are practically always expected to develop (pre-write) and organize ideas using only the target language, in which they often do not have a high level of proficiency, constituting a modally impoverished semiotic environment. Multimodal communication offers a potential leveling effect, an alternative route whereby new understandings can be reached that are ultimately supportive of authorial expression in the L2.
This notion of semiotic impoverishment points to another possible implication. As research in the area of bilingualism has shown, there are a number of affective and cognitive benefits that may accrue to learners who are freely engaged in communication via multiple linguistic codes, for instance a certain cognitive flexibility that comes of understanding concepts according to their various permutations (semantic, orthographic, etc.) within different linguistic and cultural frames (Peal & Lambert, 1962; Bialystok, 1986; Gutierrez, Baquedano-Lopez, & Tejada, 1999). In this I see a parallel with multimodal communication. Aligning myself with the work of Gardner (1973, 1983), Forman (1993) and others, I feel that all forms of representation can be understood as languages, or, more precisely, that language can be seen to share much in common with other forms of representation, like imagery and music. To paraphrase a quotation from Kress offered early in this paper, creativity comes from new combinations. This is the essence of synaesthesia. Logically then, increased semiotic richness and hybridity, both linguistic and extra-linguistic, could only serve to increase the possibility of emergent knowledge, which may in turn positively affect intellectual and affective development. Further, if L2 learners, by virtue of their preexistent multilingualism, may already be reaping the benefits of thinking in terms of multiple codes, then this is a strength that can and should be recognized and fortified through increased modal diversity in communicative practice (cf. Meskill, Mossop, & Bates, 2000).
I am hopeful about the social and pedagogical potentialities that may accord to L2 multimedia communication; however, there are certainly pitfalls that are difficult to avoid. For example, as the case of Bonnie in the Over-Accommodation of Audience section above plainly illustrates, I have observed that multimodal authors, in the face of limitless choices and prompted by convenience, time constraints, etc., often find and settle upon elements, particularly visual elements, for their compositions that they consciously know are not apt for their purposes. A second potential problem relates to the logistics of the creation of projects like these. As I mention above, the creation of these digital story projects took six weeks longer than I had originally planned; and while I feel that was time well spent, I also recognize that educators must be vigilant not to allow language learning objectives to be eclipsed by the task of completing a complicated project.
On the subject of authorship and authority in the hyper-visual multimedia world of today, Kress (2003) offers the following: "The affordances of the new technologies of representation and communication enable those who have access to them to be 'authors,' even if authors of a new kind – that is to produce texts, to alter texts, to write and to 'write back'" (p. 173). His point is well taken; now anyone with access to a computer can easily create and publish dynamic multimedia texts and be an author in a very real sense. And the results of the study presented herein suggest, I hope, that multimedia authorship holds great potential for increasing the quality and volume of authorial voice, particularly of the voices of those who may not yet have gained the ability to fully express themselves in an L2.
However, power tools do not necessarily a carpenter make, to coin a phrase. To engage synaesthesia in its truly creative sense, one must not only understand the tools and the codes of the new media age; one must understand how to recombine these communication resources so as to bend them to her/his expressive will. Designing relationships is the key (Kern, 2005).
Effectively teaching this form of awareness and skill is no small task, as we educators likely do not possess these ourselves. This is the task before us, nonetheless. Not to be misconstrued, I am by no means an unabashed cheerleader for technological innovation. However, I take it as a fact that communication has changed and is changing, and technology and multimodality are inextricably linked to those changes. This is not at all to say that language and traditional literacy learning have become less important. But we must become multimodally, multimedially "multi-competent" (Cook, 1991, 2002). Now more than ever we, our students and ourselves, need the highest level of understanding of the semiotic workings and affordances of language, as well as of other modes, in order to enact and facilitate powerful personal expression. Roland Barthes’ dead author is resurrected and thrives in the intentional, complex, powerfully synaesthetic designs of prepared minds.
NOTES
1. Jewitt (2004) makes a useful, important distinction between the oft-confused terms medium and mode: "Medium refers to technologies of dissemination, such as printed book, CD Rom, or computer application. Mode refers to any organized, regular means of representation and communication, such as still image, gesture, posture, and speech, music, writing, or new configurations of the elements of these" (p. 184). It is with this understanding that I proceed.
2. "New Media Age" is a phrase coined by Kress (2003) to describe the increasingly multimodal, image-dominant communication landscape of the present and future. It was Marshall McLuhan (1964), however, who first popularized the use of the term "media" in the sense in which it is now commonly understood.
3. Synaesthesia, as the term is used here, is not to be confused with the medical sense of the term. An explanation of this distinction follows.
4. For selected historical and academic treatments of digital storytelling, see Lambert (2002), Hull & Nelson (2005), and Hull & Katz (In press).
5. The question of whether there is such a thing as a "same idea" that is expressed differently in different modalities is an enormously complex and contentious one, and resolving it is far beyond the scope of this paper. As such, I must bracket this issue and proceed with the common sense understanding that there is such a thing as a core meaning that may be expressed in a multiplicity of modes and thereby transformed.
6. Each of the names used here is a pseudonym.
7. Pursuant to the regulations set out by the Committee for the Protection of Human Subjects at the University of California, Berkeley, formal written permission was obtained from each participant. Moreover, each participant was given the opportunity to specify the particular uses to which her work might be put, and each specifically agreed to have her work published as part of this article.
8. By "pre-writing/pre-visualization, rough construction, and completion phases of the creative process" I respectively mean the point at which ideas for the story are being simultaneously generated by means of manipulating language, imagery, sound, etc.; the point at which the rough correspondences of image and language have been tentatively arranged; and the point at which the piece is more-or-less complete.
9. To my mind, Iedema’s resemiotization is a kindred concept to Butler’s (1997) re-signification and Stein’s (2000) re-sourcing.
10. In the interest of simplicity of illustration, I chose not to include fine-grained transcriptions herein. However, there are two transcription conventions that the reader should be aware of: italics are used to indicate stress, and a single slash (/) indicates an instance of self-interruption.
11. I use "discourse" here in the socially pervasive, ideological "big D" sense that Gee (1999) intends.
ACKNOWLEDGMENTS
I would like to gratefully acknowledge Claire Kramsch, Richard Kern, David Malinowski, and three anonymous reviewers for their indispensable advice in preparing this article. I would also like to extend my great appreciation to the Berkeley Language Center, which generously funded my project, and Jane Stanley, Michelle Winn, Margi Wald, and Pat Steenland of the Berkeley College Writing Programs, whose help and cooperation I could not have done without. Most of all, thanks so much to Ally, Bonnie, Carrie, Donna, and Emma for their wonderful work and commitment throughout the project, especially over those extra six weeks.
ABOUT THE AUTHOR
Mark Evan Nelson is currently a PhD student in Education in Language, Literacy, and Culture at UC Berkeley, and his research investigates interactions among written and oral language, visual imagery, and other symbolic systems within multimodal texts. Mark’s teaching experiences include several years as a K-12 art teacher and a five-year tenure as an EFL lecturer in Japan.
REFERENCES
Arnheim, R. (1969). Visual thinking. Berkeley, CA: University of California Press.
Bakhtin, M. (1981). The dialogic imagination. Austin, TX: University of Texas Press.
Barthes, R. (1972). Mythologies. New York: Hill & Wang.
Barthes, R. (1977). Image-music-text. New York: Hill & Wang.
Bialystok, E. (1986). Factors in the growth of linguistic awareness. Child Development 57: 498-510.
Bolter, J. & Grusin, R. (2000). Remediation: Understanding new media. Cambridge, MA: MIT Press.
Brown, A. L. (1992). Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. Journal of the Learning Sciences 2, 2: 141-178.
Butler, J. (1997). Excitable speech: A politics of the performative. London: Routledge.
Chun, D. & Plass, J. (1996). The effects of multimedia annotations on vocabulary acquisition. Modern Language Journal 80: 2, 183-198.
Collins, A. (1992). Towards a design science of education. In E. Scanlon & T. O’Shea (Eds.), New directions in educational technology (pp. 15-22). Berlin: Springer.
Cook, V.J. (1991). The poverty-of-the-stimulus argument and multi-competence. Second Language Research 7: 2, 103-117.
Cook, V.J. (2002). Background to the L2 user. In V.J. Cook (Ed.), Portraits of the L2 User (pp. 1-28). Clevedon: Multilingual Matters.
Cope, B. & Kalantzis, M. (2000). Multiliteracies: The beginnings of an idea. In B. Cope & M. Kalantzis (Eds.), Multiliteracies. London: Routledge.
Cytowic, R. (1993). The man who tasted shapes: A bizarre medical mystery offers revolutionary insights into emotions, reasoning, and consciousness. New York: Tarcher/Putnam’s Sons.
di Sessa, A. A. (1991). Local sciences: Viewing the design of human-computer systems as cognitive science. In J. M. Carroll (Ed.), Designing Interaction: Psychology at the Human-Computer Interface (pp. 162-202). New York: Cambridge University Press.
Fairclough, N. (1989). Language and power. London: Longman Group UK Ltd.
Finnegan, R. (2002). Communicating: The multiple modes of human interconnection. London: Routledge.
Forman, G. (1993). Multiple symbolization in the Long Jump project. In C. Edwards, L. Gandini & G. Forman (Eds.), The hundred languages of children: The Reggio Emilia approach to early childhood education. Norwood, NJ: Ablex Publishing.
Gardner, H. (1973). The arts and human development. New York: Wiley.
Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York: Basic Books.
Gee, J. (1999). An introduction to discourse analysis: Theory and method. London: Routledge.
Gee, J. (2003). What video games have to teach us about learning and literacy. New York: Palgrave/St. Martins.
Gibson, J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Giles, H. & Coupland, N. (1991). Language: Contexts and consequences. Milton Keynes, UK: Open University Press.
Gutierrez, K., Baquedano-Lopez, P. & Tejada, C. (1999). Rethinking diversity: Hybridity and hybrid language practices in the Third Space. Mind, Culture, and Activity 6, 4: 286-303.
Hull, G. & Katz, M.-L. (In press). Crafting an agentive self: Case studies on digital storytelling. Research in the Teaching of English.
Hull, G. & Nelson, M. (2005). Locating the semiotic power of multimodality. Written Communication 22, 2: 224-261.
Iedema, R. (2003). Multimodality, resemiotization: Extending the analysis of discourse as multi-semiotic practice. Visual Communication 2, 1: 29-57.
Jewitt, C. (2004). Multimodality and new communication technologies. In P. Levine & R. Scollon (Eds.), Discourse and technology: Multimodal discourse analysis (pp. 184-195). Washington DC: Georgetown University Press.
Kendon, A. (1991). Some considerations for a theory of language origins. Man 26: 199-221.
Kern, R. (2005, February). Multiliteracies and foreign language education. Paper presented at the Berkeley Language Center Colloquium, University of California, Berkeley.
Krashen, S. (1982). Principles and practice in second language acquisition. Oxford: Pergamon.
Kress, G. (1997). Before writing. London: Routledge.
Kress, G. (2003). Literacy in the new media age. London: Routledge.
Kress, G. & Van Leeuwen, T. (1996). Reading images: The grammar of visual design. London: Routledge.
Kress, G. & Van Leeuwen, T. (2001). Multimodal discourse: The modes and media of contemporary communication. London: Arnold.
Lanham, R. (1993). The Electronic Word: Democracy, Technology, and the Arts. Chicago and London: The University of Chicago Press.
Lambert, J. (2002). Digital storytelling: Capturing lives, creating community. Berkeley, CA: Digital Diner.
Lemke, J. (1998). Metamedia literacy: Transforming meanings and media. In D. Reinking, M. McKenna, L. Labbo, & R. Kieffer (Eds.) Handbook of literacy and technology: Transformations in a post-typographic world. Mahwah, NJ: Lawrence Erlbaum.
Lo Bianco, J. (2000). Multiliteracies and multilingualism. In B. Cope & M. Kalantzis (Eds.), Multiliteracies (pp. 92-105). London: Routledge.
McLuhan, M. (1964). Understanding media: The extensions of man. New York: McGraw-Hill.
Meskill, C., Mossop, J. & Bates, R. (2000). Bilingualism, cognitive flexibility, and electronic texts. Bilingual Research Journal 23, 2&3.
Mitchell, W. J. T. (2004, April). Sounding the idols. Paper presented at the Conference on Visual Culture, University of California, Berkeley.
Mitchell, W. J. T. (1994). Picture theory. Chicago: University of Chicago Press.
Murray, J. (1997). Hamlet on the holodeck: The future of narrative in cyberspace. New York: The Free Press.
National Council of Teachers of English Commission on Composition. (1984). Teaching composition: A position statement. College English 46, 6: 612-614.
Nelson, M. & Malinowski, D. (In press). Identity and hegemony in multimodal discourse. In M. Mantero (Ed.), Perspectives on language studies: Identity, culture, and discourse in educational contexts. New York: Information Age Publishing.
New London Group. (1996). A pedagogy of multiliteracies: Designing social futures. Harvard Educational Review, 66: 60-92.
Peal, E. & Lambert, W. (1962). The relation of bilingualism to intelligence. Psychological Monographs 76: 1-23.
Peirce, C.S. (1955). Philosophical writings of Peirce. New York: Dover Publications.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129-158.
Schmidt, R. (1995). Consciousness and foreign language learning: A tutorial on the role of attention and awareness in learning. In R. Schmidt (Ed.), Attention and awareness in foreign language learning (pp. 1-63). University of Hawai'i at Manoa, Second Language Teaching and Curriculum Center, Technical Report # 9.
Stein, P. (2000). Rethinking resources: Multimodal pedagogies in the ESL classroom. TESOL Quarterly 34.
Stephens, M. (1998). The rise of the image, the fall of the word. New York: Oxford University Press.
Tufte, E. (1997). Visual explanations: Images and quantities, evidence and narrative. Cheshire, CT: Graphics Press.
Van Lier, L. (1995). Introducing language awareness. London: Penguin Group.
Copyright © 2006 Language Learning & Technology, ISSN 1094-3501.
Articles are copyrighted by their respective authors.