We don’t yet know what the full social and cultural impacts of Machine Learning text and image generators will be. How could we, the damn things are only a few months old, the important impacts haven’t even started to happen yet. What we do know is this: Machine Learning image and text generators are a lot of fun to play with.
It’s that pleasure I want to briefly reflect on today. Playing with something like DALL-E, or Stable Diffusion, or ChatGPT is for me reminiscent of the kind of slot-machine loop a game like Minecraft sets up. In Minecraft, you spend an hour whacking away at cartoon rocks with your cartoon pick for the reward of occasionally coming across a cartoon diamond. When you’re playing with Stable Diffusion you spend an hour plugging in various prompts, trying different settings, using different random seeds, for the reward of occasionally generating something that strikes you as pleasing.
What’s pleasing to me about the images I come across in this way is, often, how they capture an idea that I could imagine but not realize (as a visual artist of very little talent). In the sense that they translate ideas into artistic work without the intervening phase of mastering an artistic medium of expression, image generators call to mind the idea of “Dry Dreaming” from William Gibson’s short story “The Winter Market.”
In this short story, which prefigures in many ways Gibson’s later Sprawl novels, Gibson imagines a technology that basically reads the minds of artists (with the mind-machine interface of a net of electrodes familiar to many cyberpunk stories) and outputs artistic vision directly to a machine recording that can then be edited and experienced by an audience. At one point, the main character of the story muses about how this technology allows artistic creation by those lacking traditional artistic skill:
you wonder how many thousands, maybe millions, of phenomenal artists have died mute, down the centuries, people who could never have been poets or painters or saxophone players, but who had this stuff inside, these psychic waveforms waiting for the circuitry required to tap in....
On the surface, DALL-E and Stable Diffusion (and text generators like GPT-3, though my own personal experience of these is different, since I’m a bit better with text) seem to do just this: let us create directly from ideas, jumping over all the fiddly body-learning of composition and construction.
But of course, there is a crucial difference between the imagined and the actual technology. The “dry dreaming” Gibson imagined was basically a shortcut around the semiotic divide between signifier and signified: it exported meaning directly from a person’s brain to a recording. Let’s leave aside for a moment whether such a thing would ever be possible; I think we can still relate to the desire behind the dream. If we’ve ever struggled to put an idea down in words, we understand the fantasy here. Just take the idea out of my head and give it to someone else directly!
But DALL-E and Stable Diffusion very much do not take ideas directly from the user’s head. They take a textual set of signs from the user, and give back a visual set of signs based on what they have learned by statistically correlating all the sets of textual and visual signs they could find. What they do is, in fact, almost the opposite of what Gibson imagined with dry dreaming. Instead of direct transfer of signified with no distorting signifier in the way, they are dealing with the pure play of signifiers, without the weight of meaning to slow them down.
Of course, the signified re-enters the picture in the moment that I, the user, select an image and think “oh yes, that’s what I meant!” or even “oh wow, that’s what that could mean!” But those reactions happen in the presence of the sign already drawn for us, the re-imagined imagination of the vast set of signs that were the training data for the machine.
That moment is a lot of fun, but the change it heralds for meaning itself is at least as profound as those brought about by recording technology and documented by Kittler in “Gramophone, Film, Typewriter.” If recording allowed for the remembering of signs without the intervention of human meaning-making, then machine learning generators may allow for the creation of signs without the intervention of human meaning-making.
What that does, I don’t think any of us know yet. But it does something.
Anytime I encounter a new technology I like to knock around the phase space of its possible outputs a bit, to see what it produces as you take it through the range of possible values for various settings or inputs. I take my inspiration for this from a photography project I distinctly remember encountering years ago, but whose name I can no longer find or recall, which did this process with a camera: taking multiple images of the same subject while stepping through possible f-stop and shutter speed values. If anyone recognizes this project, please let me know what it was!
I think I’m drawn to these phase space experiments because they help me get a concrete sense of what a technology does. I’m not always a great abstract learner; I have a clearer sense of what’s happening once I get my hands dirty and try things out a bit. That’s why I’ve been wanting to try this with one of the machine-learning text-to-image programs for a while now. These programs (which you’ve probably encountered in the form of DALL-E or one of its cousins) are fantastically hard to understand in the abstract, because they rely on hugely complex statistical manipulations to generate images from text.
The quality of images this software can produce has progressed almost unbelievably rapidly over the last year. For example, about a year ago, I asked the then-hot version of a text-to-image generator (VQGAN+CLIP, I think it was called) to draw me “Professor Andrew Famiglietti of West Chester University” and got this:
Whereas the current hot image generator, Stable Diffusion (which is available free of charge and will run reasonably well even on my modest GTX 1060 graphics card) renders output for that prompt that looks like this:
More importantly, at least insofar as my fascination with technological phase space is concerned, Stable Diffusion makes it easy to tweak a couple of settings that influence how it makes images.
To understand what these settings are (at least in the vague way that I understand them) we have to quickly review how machine learning image generators, well, generate images. So far as I understand, they work by using a system that has been trained to recognize images on a vast set of image-caption pairs. That is, they learn what a “cat” looks like by seeing a very, very large number of images labelled “cat.” (For a great discussion of the Stable Diffusion training data, and some links to explore that data further, see this blog post by Andy Baio.) The image generator starts with random visual noise, uses the recognition algorithm to detect which pieces of that noise are most like its prompt, and then iteratively modifies the image to increase the recognition score. You can see the process at work in this .gif, which shows the steps Stable Diffusion uses to draw a cat, basically sculpting the most “cat-like” pieces of noise into a more and more defined cat image:
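The iterative process described above can be caricatured in a few lines of code. This is a toy illustration only, not how Stable Diffusion actually computes its updates: the “recognition score” here is a hypothetical stand-in for the learned model, and the “image” is just a short list of numbers. Still, it captures the basic loop of starting from noise and repeatedly nudging toward whatever scores as more prompt-like:

```python
import random

def recognition_score(image, target):
    # Hypothetical stand-in for the learned recognizer: higher when the
    # "image" (a list of numbers) is closer to the "target" concept.
    return -sum((a - b) ** 2 for a, b in zip(image, target))

def refine(target, steps, guidance, seed=0):
    rng = random.Random(seed)
    # Start from pure random noise, as described above.
    image = [rng.uniform(-1, 1) for _ in target]
    for _ in range(steps):
        # Nudge each "pixel" in the direction that raises the score;
        # guidance controls how hard we push toward the prompt.
        image = [a + guidance * (b - a) * 0.1 for a, b in zip(image, target)]
    return image

target = [0.5, -0.2, 0.9]  # the "cat" the recognizer wants to see
noisy = refine(target, steps=3, guidance=1.0)
sharp = refine(target, steps=50, guidance=1.0)
# More steps leave the result closer to the target than fewer steps do.
print(recognition_score(sharp, target) > recognition_score(noisy, target))
```

With more steps the toy image converges on the target, just as more inference steps refine the noise into a more defined cat.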
Stable Diffusion gives us access to two settings that let us guide this process:
Inference steps sets the number of times the algorithm will repeat the process described above, in other words how many iterative image “steps” it will generate on the way to a final image.
Guidance Scale (or CFG) determines how strictly the algorithm revises the image towards the given prompt. I’m honestly a little fuzzy about what this means, but higher values are said to make the algorithm interpret the prompt more “strictly.”
So, what does the phase space of these two settings look like? Well, if we ask Stable Diffusion to draw us several versions of “a black and white photograph of a galloping horse” (as an homage to Muybridge’s “The Horse in Motion”, which does some phase spacy work itself) using low, average, and high values for steps and guidance scale and arrange the nine resulting images in a grid, with low values on the upper left and guidance scale increasing as we go from left to right and steps increasing from top to bottom, we get this:
This gives us a rough sense of the space. Low guidance gives us vaguely horselike shapes, and low steps gives us a “sketchy,” unrefined image. Moderate guidance and steps (the recommended settings for “realistic” results) give us, well, a horse. Very high steps and guidance give us a horse with a LOT of detail (not all of the details really make sense, though) and a LOT of contrast (including an odd, glowing bit of light on the back). The presence of all four feet in this image is interesting, but as we’ll see, not entirely a predictable result of the settings. The other two corners (low guidance with high steps, and high guidance with low steps) are perhaps the most interesting, from a glitch art perspective. More on these in just a bit.
If we invest a bit of time (and a month’s allotment of Google Colab compute credits) we can expand the above into a much larger grid, slowly incrementing over both Guidance Scale and Inference Steps from very low (Guidance Scale 0 and 3 Inference Steps) to very high (Guidance Scale 16 and 100 Inference Steps) a small step at a time.
The resulting grid looks like this, again low values for both steps and guidance scale are on the upper left, step values increase as you move down the image, and guidance scale values increase as you move from left to right.
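A sweep like this is easy to script. The sketch below just enumerates the grid of setting pairs; the actual generation call is shown as a comment, since it depends on your setup. The parameter names `num_inference_steps` and `guidance_scale` match the Hugging Face diffusers API, but the specific low/high values here are illustrative choices, not the exact ones used for the grid above:

```python
from itertools import product

# Low-to-high ranges for the two settings being swept.
guidance_scales = [0, 2, 4, 6, 8, 10, 12, 14, 16]
inference_steps = [3, 10, 25, 50, 75, 100]

# Rows = steps (increasing downward), columns = guidance (left to right),
# matching the layout of the grid image described above.
grid = [(steps, cfg) for steps, cfg in product(inference_steps, guidance_scales)]

for steps, cfg in grid:
    # With a diffusers pipeline loaded as `pipe`, each cell would be:
    # image = pipe(prompt, num_inference_steps=steps, guidance_scale=cfg).images[0]
    # image.save(f"horse_s{steps:03d}_g{cfg:02d}.png")
    pass

print(len(grid))  # 54 images in this illustrative sweep
```

Keeping the random seed fixed across the sweep is what makes the cells comparable: only the two settings change from image to image.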
Several interesting and informative features emerge from a scan of this large phase-space grid.
First, a few of the images are missing! As I’ve since learned, Stable Diffusion has a built-in algorithm that attempts to censor “NSFW content” (our era’s telling euphemism for obscenity). The somewhat oversensitive nature of this algorithm can be seen in how it triggers on some random frames, with nothing particularly suggestive in any of the surrounding images:
I’ve since learned to disable the NSFW filter, but the method of action here is fascinating in itself. Basically, one machine learning system generates an image, then passes it through another machine learning system to see if the image is recognized as obscene. Of course, since generators are based on recognition systems, this does kind of suggest that someone could wire up the obscenity filter to create an obscenity generator, but this disturbing notion will be left as an exercise for the reader.
Second, the features rendered by the algorithm are incredibly mutable and ephemeral. A few steps more or less, or a bit more or less “guidance,” can cause significant changes in the image. These changes don’t seem to follow an easily discernible pattern; instead, features may emerge for a range of settings and then disappear. Most notably, the horse’s missing legs come and go at various points in the sequence. Here a leg emerges for a single image in the step sequence at a moderate Guidance Scale, along with some motion blur (another idea the algorithm seems to occasionally toy with and then discard), before disappearing again:
At a very high Guidance Scale, the leg reappears more consistently, but this phase space experiment makes me doubt that the high guidance scale has made the image “better” or “more accurate.” The process of using Gaussian noise to draw an image seems to just riff on certain image features for a while and then drop them.
Finally, there are those two corners of the space I called interesting before: the bottom left, where the high-iteration, low-guidance images live, and the upper right, where the low-iteration, high-guidance images dwell.
The low-guidance, high-iteration images are nonsense, but oddly realistic nonsense. The algorithm draws a very solid, photorealistic picture of some totally impossible shapes. Take the comparison below, for example. With the slider all the way right, it shows the Guidance Scale 0 image at 50 iterations; all the way left is 100 iterations. The image is only subtly different, but seems more solid. The “scene” the algorithm has hallucinated (some sort of city street? A market?) seems to have more depth.
The oddly human figure on the lower right of this image (which becomes incorporated into the front half of the horse with stricter guidance given to the algorithm) is also intriguing to me. These emergent human figures we might dub “The Guidance Scale Zero People,” and further experiments with Stable Diffusion suggest they are easy to generate.
Further examples coming soon to a Mastodon bot I’m building. As I was generating these, I also experimented with some prompts that asked the generator to create something other than a photograph: for example a line drawing, charcoal sketch, or painting. These tended to create loving renders of the technique (brush strokes, pencil lines) with subjects that seemed odd, fanciful, or even metaphorical.
I find these images somewhat evocative, despite the fact that I know just how little I really did to generate them. Basically, these are the result of a script I wrote that generates random image-generation prompts from terms entered into a spreadsheet (modeled on the SSBot Twitter bot application by Zach Whalen). I gathered a bunch of those, ran them at low Guidance Scale and high Inference Steps, and picked the ones that spoke to me.
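A minimal version of such a prompt-generating script might look like the sketch below. The term lists here are hypothetical stand-ins for the spreadsheet columns (the real script read its terms from a spreadsheet, following SSBot's design); the point is just the combinatorial assembly of prompts:

```python
import random

# Hypothetical term lists standing in for the spreadsheet columns.
styles = ["charcoal sketch", "line drawing", "oil painting"]
subjects = ["a lighthouse", "a locomotive", "an orchard", "a cathedral"]
modifiers = ["in a rainstorm", "at dusk", "overgrown with ivy"]

def random_prompt(rng=random):
    # Assemble one prompt by sampling a term from each column.
    return f"a {rng.choice(styles)} of {rng.choice(subjects)} {rng.choice(modifiers)}"

# Generate a batch of prompts to feed to the image generator.
prompts = [random_prompt() for _ in range(5)]
for p in prompts:
    print(p)
```

Each generated prompt then becomes an input to the image generator, and the human contribution shrinks to curating the term lists and picking favorites from the output.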
These images are compelling to me because they seem absurd in a pleasing way. It’s this automatic generation of absurdity (let’s call it Procedurally Generated Nonsense) which I find the most fascinating thing about AI image generation. In the late 19th and early 20th century, the technologies of “mechanical reproduction” made the creation of sensible texts all too easy. Legible text and clear “realistic” images became easy to make and easy to copy via machine. For at least some artists, the response was to reject sense and embrace nonsense, sometimes leveraging the affordances of these same technologies to create images that were anything but “realistic.” Instead, they embraced the absurd, the garish, the nonsensical, the fantastic.
There is a way in which the AI image generator and its kin seem to stand this equation on its head. Yes, they produce nonsense, but they often produce compelling nonsense quickly, easily, almost thoughtlessly. As such, they effectively automate a domain of art embraced exactly because it seemed to resist earlier forms of automation.
I’m not sure I like that. I’m not sure where that goes. But, in the meantime, I can’t stop asking my little machine-mind to dream me more absurdity.
So, I’ve been thinking about online instruction a lot lately. Mostly, that’s because I think it’s fairly possible that at least some amount of instruction at my University could end up being online this fall, for a lot of different reasons. It’s not a sure thing, but it seems worth planning for.
While I’m a fairly digital guy, I have, in the past, resisted teaching entirely online classes because (as I explain below) I think the standard model of online instruction is a poor fit for the subjects I teach (especially first-year composition) and my pedagogy. The coronavirus, however, looks like it could force my hand, so I’ve gone looking for resources that I could use to build thoughtful, well designed online/hybrid instruction that better meets my needs. In particular, I was drawn to the learning theory called “connectivism,” which I was vaguely aware of as the basis for connectivist MOOCs (cMOOCs): a form of massive online class that people I respected seemed to think of as less creepy, awful, and disengaged than the big name MOOCs.
What follows are my initial thoughts about connectivism and online composition pedagogy. Connectivism isn’t the shiny, new thing it was back in 2004, but I think it offers us a useful way to push back against the standard model of online teaching as “content delivery” and start building classes that function as communities where students learn to learn and practice real-world writing skills.
The Standard Model
At the risk of oversimplifying, here’s my sense of the standard model of “good online class design,” based on what I’ve gleaned from various faculty development activities, blog posts, etc.
The basic goal in this model is to break instructional content down into small “chunks,” deliver those chunks in an “engaging” way (usually video), and then do some sort of formative assessment activity (usually a quiz) to reinforce the content delivered in the content chunk. The classic Sebastian Thrun Udacity course on Statistics (Full Disclosure: I’ve failed to finish that course at least twice) is one good example of the model, with Thrun giving short explanations of statistical concepts in 3-5 minute videos using a virtual whiteboard, interspersed with short quizzes to follow up on the material. At my institution, faculty are encouraged to build “modules” in the LMS to deliver online instruction in a similar format.
I have some criticisms of this model, but first let’s admit that it represents a big improvement over the typical form of online instruction we saw during “emergency online instruction” this spring, which often amounted to little more than slapping students into a Zoom room and letting an instructor approximate his or her in-person class in the video call. As a method for delivering the sort of instruction that would traditionally be provided by a large lecture, the standard model probably works well enough (although, to be honest, I’m sometimes mystified by the persistence of the large lecture as a method of instruction).
However, for a lot of teaching, especially in subjects like composition, the standard model has some problems. Composition teaching fundamentally isn’t about learning content; it’s about practicing and developing skills. Furthermore, the standard model tends to have lurking in the background the idea of building online content that can be re-used across multiple classes and “scaled” to meet the needs of many, many students. While this kind of “reusable” and “scalable” content may have economic benefits for institutions, it tends to interrupt the building of relationships between students and faculty that leads to the attentive care we know helps students (especially students from traditionally marginalized populations) succeed. A video content module may be entertaining, but it doesn’t care about you, or your development as a writer, much less as a person. The standard model also frequently veers into advocating pervasive surveillance of learning spaces (via exam proctoring services, the surveillance features of the LMS, etc.) to make up for the lack of student-teacher relationships, trying to ensure that students remain “engaged” with the content being delivered and avoid cheating and academic dishonesty. These surveillance techniques also undercut the trusting relationships that would enable the kinds of learning composition classrooms need.
Connectivism, A Better Model for Teaching Online?
If the apotheosis of the standard model of online teaching as “content delivery” is a Massive Open Online Course (MOOC) like those offered by Udacity, then cMOOCs (which were, in fact, the original MOOCs) seem to offer an alternative. The cMOOC stresses decentralized activity on the open web over content delivered via an LMS. Stephen Downes, one of the co-organizers of the first cMOOC, described the course as “not simply about the use of networks of diverse technologies; it is a network of diverse technologies.” The class, “Connectivism & Connective Knowledge,” utilized a blog, a wiki, Twitter, Moodle, and live class sessions on a platform called Elluminate (since acquired and then “deprecated” by Blackboard). The class encouraged students to create, post, remix, and share their own content in class conversations and social media sessions, rather than simply consume content from instructors.
The “Connectivism & Connective Knowledge” MOOC serves as a pretty good illustration of the overall shape of connectivism as a pedagogical mode. As George Siemens explains in the original 2005 article that coined the term, connectivism holds that “Learning is a process that occurs within nebulous environments of shifting core elements – not entirely under the control of the individual. Learning (defined as actionable knowledge) can reside outside of ourselves (within an organization or a database), is focused on connecting specialized information sets, and the connections that enable us to learn more are more important than our current state of knowing.” Rather than attempt to optimize content delivery to students, connectivist online teachers ask students to engage in networks and practice the method of building knowledge within those networks. For connectivism, learning is not just content internalized by a learner, but also the network of human connections (fellow learners, teachers, experts) and non-human resources (databases, search tools) that learners build while gathering, evaluating, and producing content.
For me, the advantages of this approach for composition teaching seem clear. The “learning to learn” approach typical of many composition classrooms, where students are encouraged to practice the process of encountering new genres and adapting their writing to meet the needs of new rhetorical situations (rather than memorizing “rules” to guide the production of a fixed set of genres), matches well with the connectivist focus on learning as an ongoing process. As Siemens puts it, “The pipe is more important than the content within the pipe. Our ability to learn what we need for tomorrow is more important than what we know today.” Furthermore, the focus on the creation of content, rather than the consumption of content, fits nicely with what we typically want students to do in the composition classroom.
That said, connectivism is not without its failings. The original 2005 article is very much an artifact of its time, with an emphasis on structureless “self-organization” and “decentralization” which feels decidedly dated now. The “Connectivism & Connective Knowledge” MOOC relied heavily on RSS, the once great, now mostly forgotten, technology that allowed readers to gather content from widely distributed blogs and other publishers into a single reader app. No technology is more emblematic of the distributed web we lost than RSS.
Nonetheless, I think we can (re)construct a usable online pedagogy for composition by combining some of the insights of connectivism (in particular, the way it stresses helping learners practice to use online tools to build connections and remix, author and share content) with the practices of mentorship and care typical of a more traditional classroom setting. This requires abandoning the idea that online learning can “scale” to be “massive” (an idea both MOOCs and cMOOCs embrace) and instead accepting that education will remain a labor-intensive process of building relationships and care-work.
Starting Ideas for a Connectivism Inspired Composition Class
As I plan my own sections of Intro to Composition this fall, I’m thinking about what the kind of caring connectivist pedagogy I imagine above would look like in practice. For me, a big piece of this planning involves embracing a greater role for online/hybrid spaces as a potential strength of a composition class, since these spaces require students to compose their participation in the class itself via writing or electronic media. This makes the practice of class participation, if managed well, an opportunity to exercise composition skills.
With that in mind, what I’m doing now for the fall is brainstorming ways to make the class a community of composition, leveraging electronic tools to allow students to write to and for each other. I think this will rely on early, non-assessed writing assignments (inspired in part by Peter Elbow’s “Ranking, Evaluating, Liking: Sorting Out Three Forms of Judgment”) in which students are encouraged to share what forms of composition they care about and regularly consume. I may also attempt to subdivide the class into smaller communities of interest, allowing students to compose within an environment of shared passions and ideas. As they go, students would progress from composing less-assessed “discussion” pieces designed to build community with me and their peers, to more structured compositions designed to achieve some purpose they had defined together.
In order to give this shared process some structure, I’ve been brainstorming shared tasks that students could only successfully complete through effective communication and collaboration. The way gamers everywhere use wikis and other forms of collaborative communication to share strategies for solving various kinds of puzzles and challenges in games seems like one productive example of how to do this. Learning communities, online and off, that revolve around hobbies from sourdough baking to hairstyling to gardening also serve as potential models. The idea of emulating what effective online learning communities already do is a key insight of connectivism, and one worth continuing to explore (even as we admit the limitations of those “spontaneous” communities).
As students build these shared compositions, I think it makes sense to ask them to reflect on the networks they are building. What tools do they employ to complete tasks, and how do the affordances of those tools shape their writing? Who are the people who model the kind of composition they want to do, and who might make up a support network (online and off) they can draw on as they compose (and then support in turn)?
Furthermore, as I build a class as “community” I’m trying to avoid romanticizing “community” as a safe, stable “home place” where people always feel comfortable, and instead thinking about how to build a community that reflects how real-world communities require compromise and moments of being uncomfortable, especially for the relatively privileged.
This kind of communal, shared, active online/hybrid environment seems to me more promising than the standard model of “content modules.” Whatever form our teaching takes this fall, I think I’ll benefit from spending time thinking through this, and I’d love to hear from others who would like to think about assignments and class activities that could fit this kind of model.
Folks in higher-ed, much like everyone else, are trying to figure out what their post-coronavirus reality will look like. This piece, by Stern School of Business professor Scott Galloway, is one example of the now capacious genre that is “post-corona higher-ed crystal ball gazing lit.” It’s worth a read, though some who read blogs by people like me may find its ideological position… icky.
It’s precisely the ideological cringe that makes the piece worth reading, however. Galloway, who flirts with the sort of almost-consciousness of his own ideology one typically finds on r/selfawarewolves, does an excellent job of predicting what will happen if current ideological conditions hold. Since current ideological conditions tend to hold, especially in the absence of a concerted effort to change them, this makes his analysis an excellent starting point.
His prediction? That the decline of the on-campus experience and the rise of distance education during the era of COVID-19 will lead to the biggest names in higher education (MIT, Stanford, UCLA, etc.) partnering with big tech to essentially eat the higher education space the same way that Amazon has eaten retail, Spotify has eaten music distribution, or (perhaps the best analogy) NYT and WaPo digital have eaten newspaper journalism. Thousands of brick-and-mortar universities will be swept away and, in their place, will stand a small oligopoly of powerful “brands” delivering online education. His reasoning is that, in the absence of the on-campus experience, “Nobody’s going to enroll in Pepperdine if they get into UCLA.”
Let’s admit here that, as obnoxious as this logic may be, it contains an element of truth. Our institutions have, in some cases, leaned on selling “the campus experience” and neglected the work of education. Nothing demonstrates that more profoundly than the terrible conditions under which the vast majority of the people doing the actual work of education on our campuses, contingent faculty, work, all while lavish investments are made in facilities and administration.
At the same time, however, the market logic that suggests that “the best education” can be reduced to a form of content delivery and “scaled” to permit many, many more students access to “the best” professors at “the best” universities recalls the failed MOOC experiment of the last decade. What MOOCs discovered was that most students in “massive” classes simply dropped out. They dropped out because the real work of teaching isn’t content delivery, it’s building relationships and engaging in care. This can be done remotely, but it can’t be done at scale. Care work is stubbornly, expensively labor intensive.
For me, this means that, to survive and even thrive in the age of COVID-19, educational institutions need to figure out ways to do the care work of education at a (social) distance. This is fundamentally at odds with the affordances of many Officially Sanctioned digital education tools (like the LMS) which are fundamentally designed around a paradigm of content delivery and surveillance. It is also at odds with slashing the instructional workforce, especially our contingent faculty who do so much of the hands-on work of teaching. I’ll have more thoughts about how to do this in a future post.
As for big tech? Here’s a modest proposal. Apple has something like 200 billion dollars in cash-on-hand. That’s enough to pay all 700,000 adjunct faculty in the United States a salary and benefits package worth $100,000 a year for 3 years. Want to disrupt higher education? Do it by hiring and paying educators. Tech has the money. Seize the moral high ground, it’s yours if you want it.
**WARNING: THIS POST SPOILS THE OUTER WILDS QUITE BADLY. DON’T READ THIS IF YOU INTEND TO PLAY THE OUTER WILDS**
The Outer Wilds, a relatively recent release developed by indie studio Mobius Digital and published by Annapurna Interactive, earned enough press attention and critical acclaim that I decided to spend some of my decidedly limited gaming budget (both in terms of time and money) on it. During the course of playing The Outer Wilds, there were times I found myself regretting this decision, as it could be a deeply frustrating game. Ultimately, however, I found The Outer Wilds one of the most rewarding games I’ve ever played. Reaching the end of The Outer Wilds (which I will utterly spoil here, last warning) felt deeply rewarding, like the well-earned end of a carefully crafted story.
That story begins with a simple, well-crafted hook. You are an astronaut from a cartoonish alien species living in a solar system of tiny, quirky planets. Your mission is to use a translator device to discover the secrets of an extinct alien race that inhabited your solar system long ago. Twenty minutes into your mission, the Sun explodes, destroying everything. You, however, wake up twenty minutes in the past, starting all over again. The goal seems clear enough: solve the mystery of the time loop, stop the sun from exploding, save the world.
That simplicity is a bit of an illusion. Along the way lies a fair amount of often frustrating puzzle solving. A big driver of this frustration was the game’s stubborn refusal to engage in the usual “leveling up” mechanics of most contemporary games. You never get better, more powerful equipment in The Outer Wilds, you never learn any new skills (save one: a fellow astronaut, the only other character aware of the time loop, teaches you how to “meditate until the next cycle,” effectively letting the player skip ahead to the next loop without actively committing in-game suicide), and you never find any crafting recipes to create cool new stuff. You just solve puzzles, often via fairly difficult platformer-esque jumping, dodging, and maneuvering.
The decision not to let the player auto-magically become more powerful over time was a clear rejection of the ideology of contemporary games on the part of The Outer Wilds. I respect that decision, even though it almost made me quit playing several times. The fantasy of endlessly acquiring power and gear and levels is a decidedly late-capitalist one, encouraging the player to internalize a logic of accumulation and domination. That said, the pleasures of this fantasy become all the more apparent once it is taken away. As I played The Outer Wilds and struggled to complete finicky platform puzzles with my decidedly middle-aged hand-eye coordination, I often found myself thinking: “I struggle all day to be a good teacher, a good husband, a good father, a good adult, and now I’m spending my down time struggling to be good at a video game? Bring on the mechanics where I line up my cursor on a lizard man and click repeatedly to become an expert swordsman!”
But then, it’s exactly this rejection of linear progress that The Outer Wilds enacts at every turn, and the way in which this rejection connects deeply into the thoughtful underlying plot makes all that frustration ultimately worth it. You set out at the beginning of the game trying to solve the mystery and save the world. Of course you do, that’s what one does in a video game. The game helpfully provides clues that seem to lead in just this direction. Here you find evidence that the ancient aliens were experimenting with time travel, and here you discover that they built a sun-exploding device to try to power their search for something called “The Eye of The Universe,” a mysterious signal older than the Universe itself, which they apparently revere as a religious mystery.
Ultimately, though, these clues are all red herrings, at least in the sense that they do not empower you to save the world. The sun-exploding device never actually worked. Instead, the alien scheme to hunt for the Eye of the Universe sat dormant for millennia, until now, when the universe’s time is at hand and the stars are going out. The sun is exploding of old age, and it has triggered the Eye-search time loop in a cosmic mistake. As a player, the end of the game involves discovering the means to travel to the Eye, where you experience a moment of stillness in a game that has otherwise felt frantic. (From the get-go the game offers you the opportunity to sit around a campfire and just roast marshmallows, but during actual game play it felt absurd to take that opportunity; the world is gonna end in 20 minutes, who has time! At the Eye, you get this opportunity again, and now, why not?) You then witness the death of your universe and the birth of a new one. The last moment has your alien astronaut floating alone in space, dying, watching some new thing explode into being.
It’s this subversion of the “save the world” trope that, for me, felt so satisfying and thought-provoking. The notion of “saving” the world, setting things back just the way they were, is ultimately a conservative one. Moreover, it’s an impossible goal, at least for us mammals. Some sort of sentient, immortal bacterium might rightfully imagine the stasis of a saved world, but we can only ever accept that our world will end, and we will launch the next one on its way as best we can.
Stop trying to save this world, nurture the next one, and accept it won’t be ours. This seems a fitting message for 2019, and I was glad The Outer Wilds gave me a moment to reflect on it.
Actually, I mostly didn’t want to weigh in, given my sense that our current moment is sectarian enough that I may offend people whose opinion I respect by voicing my ideas in public, but I’ve decided to write this anyway.
One thing I notice is that when Warren backs away from or limits her commitment to the most aggressive version of left-leaning policy (as she now has with Medicare-for-all) critics seem to read her deviation through the lens of a particular narrative about Barack Obama. In this narrative, the reason the Obama administration was unsuccessful in achieving a progressive policy agenda is that Obama was insincere about his politics in 2008. Obama, in this narrative, may have run as a leftist but was really a centrist in his heart. He failed to achieve left goals in office because he never really wanted to achieve left goals. He was really a Trojan horse for the status quo. When Warren seems less than fully committed to Medicare-for-All or the Green New Deal, these critics seem to suggest that she too is trying to fool leftist voters into supporting a centrist agenda of support for the status quo.
For me, this narrative about the Obama administration is not very convincing. For one thing, it seems obvious to me that, while the ideal set of policies Obama might have wanted to pursue might have indeed been a bit to the right of the policies you or I might prefer, the actual outcomes of his administration were set not by those ideals, but by the massive GOP resistance he met with after the Dems lost control of the House in 2010.
More importantly, my memory of the 2008 election is that what excited us about Barack Obama was that he was running, not as a leftist, but as an authentic outsider: someone untainted by the Democratic Party Establishment and in particular by association with the disastrous war in Iraq. Someone who was more genuine and less scripted than traditional politicians. Someone who could deliver “change” (to cite half his slogan) because he was not beholden to the existing establishment.
Ultimately, this “outsider” frame seems to me more harmful to the Obama administration’s ability to achieve progressive goals than any centrist ideology. Without institutional expertise to move legislation through the relatively friendly Congress of 2009-2010, the administration was left with little demonstrable “change” to excite supporters going into the 2010 election. The Tea Party wave swept in, and now here we are.
For me, the rhetoric of outsider authenticity seems to have been very much taken up by the Sanders campaign, and that’s a big reason why I prefer Warren. I think that, in a world so heavily mediated as our own, it’s understandable that we yearn for something that feels genuine and sincere. As our media environment has become more context-collapsed, we’ve become ever more aware of the shifting performances of politicians, celebrities and other public figures, which only seems to heighten that desire for the “authentic.” Ultimately, though, I’m not at all sure that that yearning can ever be fulfilled.
Furthermore, sincerity doesn’t seem to be a particularly effective method of persuasion, outside of rather narrow audiences. Expertise and savvy alliance-building, with all the slippery code-switching that might entail, seem more promising, if devilishly hard to pull off in a social-media saturated world.
If you’ve encountered me at all (online or off) in the last 5 years or so, you’ll probably have figured out that I’m a little hung-up on the failed utopian promises of what we used to call “read-write” culture. Giving everyone access to the means of information production was going to set everyone free (we confidently predicted in 2004). Now it’s 2019 and Trump is president and Nazis are swarming everywhere. What gives?
One of the most informative recent scholarly works investigating these broken promises of online culture is Whitney Phillips’ This is Why We Can’t Have Nice Things, which presents a fascinating critical ethnography of the online culture of “trolling.” The troll, Phillips argues, may be presented as the reason why we can’t have the “nice thing” of a truly inclusive and democratic online culture, but the truth is more complicated and implicates a much wider swath of mainstream media culture. Yes, trolls are abusers, but their abuse is formed, motivated, and structured by the larger sensationalist media culture we all exist in.
In particular, Phillips examines how trolls use over-the-top sensationalist hoaxes as an exploit to capture the attention of the larger media apparatus. In one example, they spread false accounts purporting that a (non-existent) drug called “jenkem,” made from fermented human feces, was becoming popular in American high schools. In response, local news outlets across the country published sincere warnings instructing parents to watch their children for signs they had been huffing human waste. In another, trolls submitted a hoax “confession,” supposedly authored by a member of an online pedophile ring, to Oprah Winfrey. The phony confession was riddled with references to online memes, which Winfrey earnestly read to her audience. As Phillips recounts, Winfrey told her audience that she had been contacted by “a member of a known pedophile network” who claimed that “his group has over 9000 penises and they’re all … raping … children” (a reference to the Dragonball Z-derived ‘Over 9000‘ meme). Oprah’s credulous recitation of memetic catchphrases was a source of great amusement for the trolls in their den on 4chan.
This second example is, for me, particularly telling. Why go to such lengths simply to trick a well-known talk show host into reciting an obscure in-joke on the air? In part, Phillips suggests that the answer lies in trolls’ desire to take control of the larger media apparatus. This is viral media, not in the sense that it spreads from one exposed victim to the next, but in the sense that it’s a small fragment of information capturing and re-using the cellular machinery of a much larger organism for its own goals.
The desire of trolls to use this viral technique to bend large media outlets to their whims reminds me of Neil Postman’s description of the alienating effects of mass media. For Postman, the mass media of the television age collapsed distance and thus swamped viewers with information about far-away problems they had no meaningful agency to solve. He writes:
Prior to the age of telegraphy, the information-action ratio was sufficiently close so that most people had a sense of being able to control some of the contingencies in their lives. What people knew about had action-value. In the information world created by telegraphy, this sense of potency was lost, precisely because the whole world became the context for news. Everything became everyone’s business. For the first time, we were sent information which answered no question we had asked, and which, in any case, did not permit the right of reply
Read through the lens of Postman, the troll may appear then as culture-jammer, seizing the “right of reply” from the alienation of one-way media. The kind of read-write hero we all yearned for in 2004. But of course, that’s not what trolls are (or at least, not all they are). Trolls, as Phillips reminds us, taunt those grieving the recently dead and spread racist and sexist humor with glee. Trolls are no read-write hero, but a heavily dissociated subculture interested in the manipulation of hapless others for their own amusement.
Part of what might account for this, I’d like to speculate here, is the slippage between mass media and everyday communicators in the era of social media. This slippage (which Alice Marwick describes in her review of the concept of “micro-celebrity”) encourages us to use the same detached, critical lens we developed for reading the carefully managed presences of mass-media celebrities when interacting with ordinary people in online spaces. For me, this slippage is perhaps best captured in young people’s use of the word “cancelled” to describe someone they have pointedly decided to ignore/block/mute online. We’re all TV now, and if we don’t like what’s on, we “cancel” it.
In the heyday of utopian read-write culture we hoped to turn Television into genuine communities. Maybe part of what has actually happened is that genuine communities have turned into something like television.
The NY Times has a special section today on privacy. One piece proclaims “My Phone Knows All, and That’s Great.” It’s satire, but satire so dry that in our Poe’s Law dominated age it is destined to be taken as sincere opinion over and over and over again. Indeed, it’s not so different from sincere arguments I hear from students all the time: “I wasn’t using my data, why do I care if Google vacuums it all up for ads? What bad thing is going to happen to me if I get a targeted ad, anyway?”
The honest answer to that question is “probably nothing.” Probably nothing bad will happen to you. It’s important to point out, however, this is also the honest answer to the question “what bad thing will happen if I don’t put my baby in a rear-facing car seat?” Probably nothing. Probably you will go about your day and never have a car accident and the baby will be fine. That’s usually what happens. Almost all the time. Almost.
But of course, that “almost,” that small sliver of a chance that something bad could happen (even though, at the scale of n=1, it probably won’t), scaled to 300 million people, means thousands of children saved by rear-facing car seats. Thus we regulate, and mandate that manufacturers produce safe car seats, and parents install them. It’s an awkward, imperfect, ungainly system. It’s understandable that, as they spend their 20th awkward minute in the driveway trying to install an ungainly child safety device, many parents may briefly entertain conspiracy theories that the whole system is a profit-making ploy on the part of the Graco and Chicco corporations. Nonetheless, we do it, and it basically works.
In the same way, online surveillance is probably mostly harmless at an individual level. At a systemic level, the harms become more plausible. Some individuals may be uniquely likely to be harmed by ads that trigger traumas or prey on vulnerabilities (think of the ads targeted at soon-to-be parents, at the sick, at the depressed). At a society-wide level, better, slicker ads could further fuel the churn of over-consumption that seems to exacerbate, if not cause, so many social ills.
Of course, we’ve dealt with potential harms of ads before. At the dawn of Television, Groucho Marx would open “You Bet Your Life” with an integrated ad for Old Gold cigarettes. We eventually decided that both tobacco ads and integrated ads were bad ideas, and regulated against them (though the latter is on its way back). TV was still able to use advertising as a business model for funding a fundamentally public good (broadcast, over-the-air TV, which anyone could pick up with an antenna, an idea that seems vaguely scandalous in today’s premium-subscriptions-for-everything world). In the same way, we could put regulatory limits on what advertisers can do with our data, how they can target ads, etc. It wouldn’t kill the business model. Oh, the platforms and the ad folks will scream bloody murder about it, because their margins will suffer, but they will survive.
I, personally, would have preferred a slightly more radical option: call it the BBC model for internet content, where every purchase of an internet-connected device would pay a small fee towards a collective fund to pay internet content providers. Again, this is a reasonable adaptation of public-goods provisioning models from the broadcast age. A flawed mechanism, but one we know to work.
Fifteen years ago, there were serious proposals for such a plan, which would have avoided the era of targeted advertisers (and the surveillance system they have built) entirely. It never really got any traction, though. Instead, there was an idea in the air that the Internet Was Different. That it would be a mistake to try the techniques of the past on this radical, decentralized medium. That, rather than developing a centralized mechanism for collecting and distributing fees as a business model for content creation, it would be better to allow flexible, horizontal associations of volunteers to build things on their own. These volunteers could build amazing things for free, just look at Wikipedia! Clay Shirky argued that if we could just harness some of the cognitive surplus being wasted on the “heat sink” of Television we could easily build “100 Wikipedia projects per year.”
But, of course, we didn’t build hundreds or thousands of Wikipedias. We built one Wikipedia, and we built Facebook. In retrospect, the radically decentralized Utopia seems like a cover story for the advertising free-for-all we were actually building.
Our students are about to start using Machine-Learning driven semi-automated composition tools, first as illicit tools designed specifically for cheating, and then as a normal part of the everyday composition process. As writing teachers, we need to think about how we’ll respond to this.
First as plagiarism….
What makes me so sure automated composition tools are coming? A recent machine learning system for generating text developed by OpenAI, called GPT2, generates uncannily good text output with minimal human intervention.
While the full version of the GPT2 language model has not been released, and presumably requires some fairly serious computer hardware, a simpler version of GPT2, called GPT2-117M, is available for anyone to download and play with on GitHub. It took me, a self-trained coder and system administrator of fairly modest skill, about an hour to get the software up and running on a very standard Windows PC. I fed this simpler version of the software the prompt for our First Year Writing program’s standard writing reflection exercise, “What is Your Definition of Good, Effective Writing?”, and got output like this:
My definition of good effective writing is that it may be written with as much care as possible for an author’s pleasure, the same quality which applies to a child. I should therefore be wary of making use of this method when it should serve as a source of truth. I must, in fact, like to point out this method and take advantage of its advantages. It will be an error to say that all good works are good works, but in every case there will always be an out-of-hand way where the good works that might exist ought to be, and where the out-of-hand way is a far better choice. To make good work of a great author is the main virtue of all good works, whether it be of children or of poets or theologians, when there can be any good way from it to any good work, although in rare instances, and not only in cases where there may be a special need to find good works, there may be no other way, and there will always be a special need. In short, a good work is simply written without attention, not without effort. Nothing is more difficult to write, as one might suppose, and the whole idea of writing cannot take place without the use of written prose. If any other way were possible I might consider myself quite open to using this method. No more should people say, ‘We are going to write this book by this night! It is good for us!’ No more should anybody say, ‘It’s good for our children! It’s good for us! It should only be done by one who loves his children!’ This method will be adopted as a rule and used wherever possible.
(GPT2-117M Output)
The relative simplicity of setting up the available GPT2 tool, and the relatively modest computer required to run it, both suggest that “auto-generate your assignment” websites will likely crop up in the next few months. Students are likely to use such a resource, much as they already use tools like auto-paraphrasing websites.
Using auto-generated text to cheat on writing assignments is, I would argue, a symptom of larger failures in the way we teach and assess writing. As administrations pack writing classrooms with ever more students, and assign ever more contingent faculty to do the work of first-year writing, the amount of time instructors have to devote to reviewing each writing assignment dwindles. This encourages the use of automated plagiarism-detection tools, like Turnitin, which in turn legitimize the use of automated plagiarism-detection-avoidance tools, like auto-paraphrasers and now, likely, GPT2-based text generators. Students likely think, “If the instructor can’t take the time to read my assignment, why should I take the time to write it?” Machine reading begets machine writing and vice-versa, just as in the now decades-long war of spammer against spam detector.
Legitimate Cyborg Writers and Bullshit Writing Work
While automated writing tools may start out functioning as illicit plagiarism aids, they are likely to spread to the world of legitimate writing tools in short order. In many ways, automated writing is already a part of how we compose. Autocomplete in smart phone messaging apps is the most everyday form of this, and tools like Google’s email auto-response have begun to extend the role of cyborg writing in our everyday lives. It isn’t hard to imagine a new and improved form of Microsoft Word’s infamous “Clippy” tool that would allow writers to compose various genres of text by selecting the desired sort of document, entering some key facts, and then using GPT2 or a similar machine-learning driven text-generation tool to create a draft document the author could then revise (or perhaps tweak by setting further parameters for the tool, “Siri, make this blog post 16% snarkier”).
Such a cyborg writing environment may strike some as unsettling. Surely, critics might say, the process of composition is too important to our identity and sense of self to be automated away like this. I think there is some important truth in this critique, which I’ll elaborate on later, but I also think that the world is awash in what we might call (to paraphrase David Graeber) “Bullshit Writing Work.” Writing done, not because any actual audience wants to read it, but because some requirement somewhere says it simply must be done. Work emails, report summaries, box blurbs, website filler, syllabus mandatory policies, etc, etc. We’ve all at some point written something we knew no one would ever read, just because The Requirement Says There Must be text there. If automated tools can do the bullshit writing work, we should let them. There is no implicit honor in drudgery.
I know that the practice of teaching writing has wrestled for a long time with the problem of bullshit writing assignments, and that many people have done a lot of thinking about how to make student writing feel like something composed with a real purpose and audience in mind, rather than something that simply Must Be Done Because Syllabus. I also know that, in my own experience as a teacher, I often struggle to put these ideas into practice successfully. Too often I find that I try to assign something I mean to be Real Writing (here’s a scenario, now compose a blog post, a grant proposal, a tweet!) that ends up feeling to students like writing they must do for class, and then also pretend that they are doing it for some other reason. Bullshit on bullshit.
I can’t help but wonder if, as we think about the imminent arrival of even-more-automated cyborg writing tools than the ones we already have, we might use this as an opportunity to think about how and why writing matters. In short, as machines begin to take an ever-increasing role in creating the products of writing, I wonder if we could redouble our efforts to help students understand the importance of the process of writing. In particular, I think we need to focus on the value the writing process has in and of itself, and not as a means for creating a written product. In other words, we might:
Explicitly emphasize writing-to-think and writing-to-learn. Writing is a process, first and foremost, of composing the self (a lesson I learned from Alexander Reid’s The Two Virtuals). Even as “compositions” become automated, the process of self-composition remains something we do in and through written symbols, and keeping those symbols close to the self, in plain text rather than in the black boxes of machine-learning algorithms, remains a powerful tool for thought.
Spend even more time working on pre-writing, planning, outlining, note-taking. Too often, students are simply told to do this work, with the expectation that they can figure it out on their own and will really need our help only when they get to the drafting and revising stage.
Embrace cyborg text, and allow it into our classrooms. This doesn’t mean we should abandon writing exercises that might help students build hands-on experience with text. Such exercises will help them build important instincts that will continue to serve them well. But it does mean we should consider teaching how to plan for, engage with, and revise the products of machine-assisted writing as it enters the mainstream.
The ultimate effects of semi-automated writing will be far more profound than these few pragmatic steps can address. Still, these are some ways we might adjust our classrooms in the near term, as we continue to wrestle with larger shifts.
A brief timeline of some important events in the history of peer production on the web (or, really, the larger 21st-century web), just so I can keep the chronology straight for myself. I’ve assembled this as part of prep for an article on the history of Wikipedia, so events I think of as connected to Wikipedia’s emergence are privileged. This is a note-to-self sort of thing. I constructed it idiosyncratically, remembering things that seemed important at the time and snowballing from there. It’s not meant to be exhaustive or definitive.
Spring 1985: The WELL founded
October 1985: Free Software Foundation Formed
August 1988: IRC Created
February 1989: GNU GPL Version 1 released
April 1989: MP3 Patented
July 1990: Electronic Frontier Foundation formed
January 1991: First Web Servers Available
September 1991: First Linux Kernel Available
September 1993: Release of NCSA Mosaic Browser / AOL adds USENET (“Endless September”)
January 1994: Yahoo! Founded
July 1994: Amazon Founded / WIPO Green Paper on IP (DMCA groundwork)
September 1994: W3C Formed
November 1994: Beta releases of Netscape Available / Geocities Founded as “Beverly Hills Internet”
March 1995: Ward Cunningham releases first wiki software
April 1995: First Apache Webserver Release (0.6.2)
July 1995: Geocities Expands “Cities” Available for Users
August 1995: Netscape IPO / Internet Explorer 1.0 released
December 1995: Altavista search engine launches
February 1996: Communications decency act passes / “Declaration of Independence of Cyberspace” published
December 1996: Flash 1.0 released
May 1997: Amazon IPO
September 1997: Slashdot begins
October 1997: Explorer 4.0 (version that will take majority market share from Netscape) released
December 1997: RSS Created / Term “Weblog” Coined
April 1998: BoingBoing.net at current web address (sources say it began 1995)
May 1998: Microsoft anti-trust case (Browser bundling) begins
August 1998: Pets.com Founded/Geocities IPO/Blogger launched
September 1998: Google Founded
November 1998: Netscape releases source code for communicator
January 1999: Yahoo! buys Geocities
June 1999: Napster service begins
November 1999: Code and Other Laws of Cyberspace published
December 1999: Lawsuit against Napster begins
January 2000: 16 Dot-com companies run superbowl commercials / AOL-Time Warner Merger Announced
March 2000: Nupedia goes online
March 2000: NASDAQ Peaks and begins to decline / Gnutella released
November 2000: Pets.com defunct
January 2001: Wikipedia goes online / Creative Commons Launched
February 2001: Peak Napster Users
July 2001: Napster Shut Down
September 2001: Moveable Type Blog Software announced
August 2002: “Coase’s Penguin” published
March 2003: Friendster goes online
May 2003: WordPress released
June 2003: First “Flash Mob”
August 2003: Myspace Launched
February 2004: Flickr launched / Facebook Launched
May 2004: The Anarchist in the Library published
October 2004: First Web 2.0 Summit
November 2004: Digg Launched
February 2005: YouTube Launched
June 2005: Reddit Launched
March 2006: English wikipedia has 1 million articles
April 2006: The Wealth of Networks published
May 2006: “Digital Maoism” published
June 2006: Term “crowdsourcing” coined / Myspace Overtakes Google as most visited site
January 2007: Wikipedia’s editor population peaks and begins to decline (largely unacknowledged until 2009 or so)
September 2007: English Wikipedia has 2 million articles
February 2008: Here Comes Everybody published
April 2008: Facebook overtakes Myspace as most visited social networking site
October 2010: Limewire shuts down
August 2015: Facebook reports one billion users in a single day