One of the first things we’ll explore in this class is git, GitHub, and GitHub Pages. GitHub Pages is by no means required for hosting your projects, but it’s free and fast and lives on GitHub, so there are many reasons to consider it, at least during the homework and experimentation stage of a project. Here are some resources for getting started.
The core language for this class is JavaScript. If JavaScript is new to you, I would suggest starting with the first four chapters of Eloquent JavaScript. There are many more resources for learning online as well.
The framework we’ll be exploring is p5.js. Here are some links to get you started.
The class will assume basic knowledge of HTML and CSS. Of course I’m happy to answer questions and go over these topics, but they won’t be explicitly covered in class, other than in the context of JavaScript DOM manipulation. For review, I would suggest reading the p5.js HTML and CSS overview. The HTML and CSS book is also excellent.
The first JavaScript task I’ll demonstrate is “DOM manipulation”, i.e. using programming to change the HTML and CSS of a page on the fly. This is possible with native JavaScript, as well as with many different JS frameworks and libraries, the most well-known probably being jQuery. In class, we’ll primarily use the p5.js DOM library, and dig into native JS and other frameworks when necessary.
The key functions and topics I will discuss in class are:
createElement(), createP(), createDiv(), createImg()
.parent(), child()
createButton(), createSlider(), createInput(), createSelect(), createCheckbox()
.mousePressed(), mouseOver(), mouseOut(), changed(), and more? What about addEventListener()?
select() and selectAll(), id vs class vs tag
style() (and when to use a CSS file)
.style(), source code
JavaScript objects will be key to just about everything we do this semester. We’ll be using objects for DOM manipulation like p5.Element and the native JS Element. We’ll be exploring data coming in as JSON (“JavaScript Object Notation”). We’ll focus a lot on the JS String object as well as objects from other libraries. And we’ll write our own objects for analyzing and generating text. The examples will use ES6 classes.
The whole point of this week is to work with programmatic text mashups in the browser. Take a look at the William Burroughs Cut-Up example. We’ll need two essential skills here: how to load text (from a file or user) and how to work with JavaScript Strings. Let’s first start with loading text from a file. The simplest way to do this in p5 is with loadStrings(). It loads a local file (accessed by its path relative to the HTML file). The simplest way to get the data is to use preload(), which guarantees that the data is read before setup() triggers.
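Here is a minimal sketch of that pattern. loadStrings(), preload(), and setup() are the p5.js names discussed above; the path files/sample.txt and the use of createP() to report a line count are assumptions made for illustration.

```javascript
// p5.js sketch fragment: p5 calls preload() first and setup() afterwards.
let lines; // will hold an array of strings, one per line of the file

function preload() {
  // 'files/sample.txt' is a placeholder path, relative to the HTML file
  lines = loadStrings('files/sample.txt');
}

function setup() {
  noCanvas();
  // By the time setup() runs, the array has been filled in
  createP('Loaded ' + lines.length + ' lines.');
}
```

This only runs inside a page that includes p5.js, since loadStrings(), noCanvas(), and createP() are supplied by the library.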
Note the naming of the variable lines. One of the odd nuances of loadStrings() is that it loads all the text into an array, with each “line” of text as a separate element of the array. A text file with three lines, for example, comes in as an array of three strings.
This is convenient in many cases, but for us right now, we just want all the text as one giant string. Therefore, we can use the join() function to put it back together. This can’t happen until setup(), however, since the data isn’t guaranteed to be loaded until then.
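To make the array-versus-string distinction concrete, here is a small sketch. The three lines of text are made up, and I am using the native Array join() so the snippet runs outside of p5; as far as I know, p5’s join(lines, '\n') gives the same result.

```javascript
// What loadStrings() would hand back for a three-line file (made-up content)
const lines = [
  'This is some text',
  'spread over three lines',
  'in a file.'
];

// Put the lines back together, separated by line breaks
const text = lines.join('\n');

console.log(lines.length); // 3
console.log(text);
```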
While using preload() is a nice, quick, and easy trick for getting text in, it’s not how you would typically see JavaScript code working. The flow of a JavaScript program is usually a sequence of events, and functions act as blocks of code triggered when certain events occur. These functions are known as “callbacks”: they are “called back” when the event is triggered. You’ve likely seen this with the p5.js DOM functions mousePressed(), mouseOver(), mouseOut(), etc.
Loading data works in a similar way. There is a moment where the code asks to load the file, and then an event that follows later when the data is actually loaded (there could also be an “error” event if there is a problem with the file). loadStrings() works exactly this way when you pass it two arguments: the name of the file, and a function that will be executed when the data from the file is ready (the callback).
The use of a callback is very typical of JavaScript, and we’ll be seeing many examples of this over the course of the semester. It’s also possible to write an “anonymous” function directly as an argument to loadStrings(), but this will make the code a bit harder to follow. Let’s take a look at the fileready() function.
The function takes a single argument: lines. lines contains all of the text from the file in an array of strings (unless there was an error).
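The callback version might be sketched like this. fileready() is the name used above; the file path and the choice to display the text with createP() are assumptions for illustration.

```javascript
// p5.js sketch fragment using a callback instead of preload()
function setup() {
  noCanvas();
  // Ask for the file; p5 calls fileready() later, once the data has arrived
  loadStrings('files/sample.txt', fileready);
}

// The callback receives the file contents as an array of strings
function fileready(lines) {
  const text = lines.join('\n');
  createP(text);
}
```

Nothing after the loadStrings() call in setup() can rely on the text being there yet; anything that needs the data belongs inside fileready().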
The next way of getting text in that I want to examine is with a user-selected file. This can be accomplished in one of two ways: a “choose file” button or a “drop zone” (an area of the page onto which a user can drag and drop a file).
The choose file button can be generated fairly easily with p5.js using createFileInput(). createFileInput() requires only a single argument, a callback for when the file(s) are loaded. A second argument, 'multiple', is optional if you want to allow the user to select multiple files. In the case of multiple files, the callback is triggered once for each file.
The argument passed to the gotFile() callback is a p5.File object. It contains metadata about the file such as its name, type, and size, as well as the actual contents of the file, its “data.” All of these are available as properties of the p5.File object and accessible with dot syntax. The following more fleshed out version of the callback creates DOM elements displaying the metadata and contents. Note how a different action can be performed depending on the file’s type.
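One way such a callback could look is sketched here. gotFile() and createFileInput() come from the text above; describeFile() is a hypothetical helper I am adding so the formatting can be tried on its own, and the exact wording of the output is an assumption.

```javascript
// Hypothetical helper: build the lines of metadata to display.
// Works on any object with the name, type, subtype and size
// properties that a p5.File provides.
function describeFile(file) {
  return [
    'Name: ' + file.name,
    'Type: ' + file.type + '/' + file.subtype,
    'Size: ' + file.size + ' bytes'
  ];
}

function setup() {
  noCanvas();
  createFileInput(gotFile, 'multiple');
}

// Called once per selected file
function gotFile(file) {
  for (const line of describeFile(file)) {
    createP(line);
  }
  if (file.type === 'text') {
    // For text files, data holds the contents as a string
    createP(file.data);
  } else if (file.type === 'image') {
    // For images, data can be used as the image source
    createImg(file.data, file.name);
  } else {
    createP('No preview for this type of file.');
  }
}
```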
Another, often more convenient, way to accept files from a user is to allow the user to “drag and drop” files in the page itself. To do this, you first need to create and style a div that will act as the “drop zone”. For example:
There’s nothing particularly special about the CSS for the above drop zone, just some padding and a dotted line.
This could all be generated in p5 using createDiv(), but it’s nice to also see scenarios where the HTML is “hard-coded” and then accessed with JavaScript. Since the div has the id drop_zone, the p5 select() function can be used to grab the DOM element.
Note the use of the hash sign # to indicate a DOM element id. Once you have a DOM element to act as a drop zone, there are three events you can handle: dragOver(), dragLeave(), and drop(). dragOver() and dragLeave() are just like mouseOver() and mouseOut(), except that the events are triggered only if the user is dragging a file over the element rather than just hovering over it. This can be useful for giving the user some feedback as to what is going on.
The event we care most about is drop(). This event takes two callbacks: one to handle the moment the user drops the file(s), and one that is triggered when each file is loaded and ready to be accessed. When calling drop(), the arguments are in the reverse order: first is the callback for handling the files, and second is the callback for the moment of drop.
Note how I am re-using the exact same gotFile() function that we had with the “choose files” button.
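Putting the three drop zone events together might look like the sketch below. It assumes the page already contains a div with the id drop_zone; highlight() and unhighlight() are names I made up, the colors are arbitrary, and gotFile() is a shortened stand-in for the fuller callback.

```javascript
let dropZone;

function setup() {
  noCanvas();
  // Grab the hard-coded div by its id
  dropZone = select('#drop_zone');
  // Visual feedback while a file is dragged over the zone
  dropZone.dragOver(highlight);
  dropZone.dragLeave(unhighlight);
  // First the per-file callback, then the callback for the drop itself
  dropZone.drop(gotFile, unhighlight);
}

function highlight() {
  dropZone.style('background-color', '#AAA');
}

function unhighlight() {
  dropZone.style('background-color', '#FFF');
}

// Called once for each dropped file, when it is loaded
function gotFile(file) {
  createP(file.name + ' (' + file.size + ' bytes)');
}
```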
If you want to get a large body of text typed in by a user, createInput() isn’t a great choice. It’s meant more for just a couple of words or a single sentence.
For a larger body of text, the <textarea> element can be generated using createElement().
The p5 size() function can be used to adjust the area’s default size.
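A sketch of a text area with a button to read it back could look like this. The starting text, the dimensions, the button label, and the showLength() function are all assumptions for illustration.

```javascript
let area;

function setup() {
  noCanvas();
  // Create a <textarea> with some starting text
  area = createElement('textarea', 'Type or paste some text here.');
  // Width and height in pixels
  area.size(400, 200);
  const button = createButton('submit');
  button.mousePressed(showLength);
}

// Read whatever is currently in the text area
function showLength() {
  const text = area.value();
  createP('You entered ' + text.length + ' characters.');
}
```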
It’s also possible to simply use a div or p element and assign it the attribute contenteditable. This makes any DOM element editable by the user (and you can then capture the content of that element with the html() function).
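The contenteditable approach can be sketched in p5 as follows. The button and the readBack() function are assumptions I am adding to show html() being used to capture the edited content.

```javascript
let editable;

function setup() {
  noCanvas();
  editable = createDiv('this will be editable');
  // Turn the div into something the user can type into
  editable.attribute('contenteditable', 'true');
  createButton('read it back').mousePressed(readBack);
}

// html() with no arguments returns the element's current content
function readBack() {
  createP('Current content: ' + editable.html());
}
```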
As always, these elements can also be written into the HTML directly and accessed in p5 with select() and selectAll().
Once you have data from a file or user input (or user file input!), the next step is to do something interesting with it. For this week, it’s up to you to invent something. To demonstrate some possibilities, I’ll first run through some of the basic functions available as part of the JavaScript String object and then describe one scenario for analyzing text: the Flesch index.
I should note that almost everything I am doing this week could be improved or expanded with regular expressions, but I am explicitly saving that as a topic for next week.
A String, at its core, is really just a fancy way of storing an array of characters. With the String object, we might find ourselves writing code like the following.
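As a small sketch of treating a string like an array of characters, the loop below walks through a made-up phrase one character at a time.

```javascript
const phrase = 'cut-up';
const letters = [];

// Visit each position, just as you would with an array
for (let i = 0; i < phrase.length; i++) {
  letters.push(phrase.charAt(i)); // phrase[i] works too
}

console.log(letters); // [ 'c', 'u', 't', '-', 'u', 'p' ]
```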
Interestingly enough, there is no distinction between an individual character or a string in JavaScript. Both of the variables below are storing the same datatype.
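For instance, with two made-up variables, typeof reports the same thing for a single character and for a whole word.

```javascript
const letter = 'a';
const word = 'apple';

console.log(typeof letter); // "string"
console.log(typeof word);   // "string"
```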
In JavaScript, strings can be literal primitives or objects.
For the most part, this is a distinction you don’t have to worry about. JavaScript will automatically convert a primitive String into an object when necessary. In general, however, it’s good practice to initialize strings as primitives to increase performance.
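The difference between the two forms can be seen in a short sketch. The variable names are mine.

```javascript
const primitive = 'hello';           // string literal (primitive)
const wrapped = new String('hello'); // String object

console.log(typeof primitive); // "string"
console.log(typeof wrapped);   // "object"

// Methods work on the primitive because JavaScript wraps it
// in a temporary object behind the scenes
console.log(primitive.toUpperCase()); // "HELLO"

// Loose equality compares the text; strict equality also compares the type
console.log(primitive == wrapped);  // true
console.log(primitive === wrapped); // false
```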
JavaScript provides us with a basic set of string functions that allow for simple manipulation and analysis (again, leaving out regular expressions for now!). All of the available String properties and functions are laid out in the JavaScript reference, and I’ll explore a few useful ones here. Let’s take a closer look at three: indexOf(), substring(), and the length property.
indexOf() locates a sequence of characters within a string. For example, run this code and examine the result:
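A short sketch of indexOf() on a made-up sentence:

```javascript
const sentence = 'The quick brown fox jumps over the lazy dog.';

console.log(sentence.indexOf('The'));   // 0
console.log(sentence.indexOf('quick')); // 4
console.log(sentence.indexOf('fox'));   // 16
console.log(sentence.indexOf('cat'));   // -1
```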
Note that indexOf() returns 0 when the match starts at the first character, and -1 if the search phrase is not part of the String.
After you find a certain search phrase within a string, you might want to pull out part of the string and save it in a different variable. This is called a “substring” and you can use JavaScript’s substring() function to take care of this task. Examine and run the following code:
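A small sketch of substring(), using indexOf() to find the start and end positions in a made-up sentence:

```javascript
const sentence = 'The quick brown fox jumps over the lazy dog.';

const start = sentence.indexOf('quick'); // 4
const end = sentence.indexOf('fox');     // 16

// Everything from index 4 up to, but not including, index 16
const part = sentence.substring(start, end);

console.log(part);        // "quick brown "
console.log(part.length); // 12
```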
Note that the substring begins at the specified beginning index (the first argument) and extends to the character at the end index (the second argument) minus one. Thus the length of the substring is end index minus beginning index.
At any given point, you might also want to access the length of the string. This is accomplished with the length property.
It’s also important to note that you can concatenate (i.e. join) strings together using the + operator. With numbers, plus means add; with strings (or characters), it means concatenate.
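Both ideas fit in one short sketch with made-up values:

```javascript
const first = 'cut';
const second = 'up';

// With strings, + concatenates
const joined = first + '-' + second;
console.log(joined);        // "cut-up"
console.log(joined.length); // 6

// With numbers, + adds
console.log(1 + 2);     // 3
// If either side is a string, the result is a string
console.log('1' + '2'); // "12"
console.log('1' + 2);   // "12"
```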
One string-related function that will prove very useful in our text analysis programs is split(). split() separates a group of strings embedded in a longer string into an array of strings.
The built-in split() native to JavaScript can take a regular expression as its delimiter, and I’m saving regular expressions for next week, so this week I’ll demonstrate splitting with the p5.js functions split() and splitTokens().
Examine the following code:
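Here is a sketch of splitting and re-joining. To keep it runnable outside of a p5 sketch, I am swapping in the native split() and join() with a plain string delimiter (no regular expressions); inside p5 the equivalent calls would be split(text, ' ') and join(list, ' '). The sample strings are made up.

```javascript
const spaced = 'The quick brown fox jumps over the lazy dog.';

// Split wherever there is a space
const words = spaced.split(' ');
console.log(words.length); // 9
console.log(words[3]);     // "fox"

// Any string can serve as the delimiter
const commas = 'apples,pears,plums';
const fruits = commas.split(',');
console.log(fruits); // [ 'apples', 'pears', 'plums' ]

// The reverse operation: glue the pieces back together
const rejoined = fruits.join(' and ');
console.log(rejoined); // "apples and pears and plums"
```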
To perform the reverse of split, the p5 function join() is used.
I’ll end this week by looking at a basic example of text analysis. I’ll read in a file, examine some of its statistical properties, and display a report. The example will compute the Flesch Index (aka Flesch-Kincaid Reading Ease test), a numeric score that indicates the readability of a text. The lower the score, the more difficult the text. The higher, the easier. For example, texts with a score of 90-100 are, say, around the 5th grade level, whereas 0-30 would be for “college graduates”.
The Flesch Index is computed as a function of total words, total sentences, and total syllables. It was developed by Dr. Rudolf Flesch and modified by J. P. Kincaid (thus the joint name). Most word processing programs will compute the Flesch Index for you, which provides us with a nice method to check our results.
Flesch Index = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
The pseudo-code looks something like this:
The examples above on this page demonstrate how to read in text from a file and store it in a String object. Now, all I have to do is examine that string, count the total words, sentences, and syllables, and apply the formula as a final step.
The first thing I’ll do is count the number of words in the text. We’ve seen in some of the examples above that we can accomplish this by using split() to split a String up into an array wherever there is a space. For this example, however, we are going to want to split by more than a space. A new word occurs whenever there is a space or some sort of punctuation.
Note again how splitTokens() will split using any of the listed characters as a delimiter. Next week, I will cover how to use regular expressions to split text.
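The word count can be sketched without p5 as follows. tokenize() is a hypothetical stand-in I wrote to imitate what splitTokens() is described as doing above (split on any of the listed characters, dropping empty pieces); the delimiter list and the sample text are assumptions.

```javascript
// Hypothetical stand-in for p5's splitTokens(): split on any of the
// delimiter characters and drop empty pieces, without regular expressions
function tokenize(text, delimiters) {
  const tokens = [];
  let current = '';
  for (const ch of text) {
    if (delimiters.includes(ch)) {
      // A delimiter ends the current token, if there is one
      if (current.length > 0) tokens.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  // Keep whatever is left after the last delimiter
  if (current.length > 0) tokens.push(current);
  return tokens;
}

const sample = 'Hello, world! How are you?';
const words = tokenize(sample, ' .,:;!?');

console.log(words);        // [ 'Hello', 'world', 'How', 'are', 'you' ]
console.log(words.length); // 5
```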
Now that I have split up the text, I can march through all the words (tokens) and count their syllables.
Ok, so countSyllables() isn’t a function that exists in JavaScript. I’m going to have to write it myself. The following method is not the most accurate way to count syllables, but it will do for now.
Syllables = total # of vowels in a word
(not counting vowels that appear after another vowel and when ‘e’ is found at the end of the word)
The code looks like this:
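One possible version of countSyllables() following the rule above is sketched here. Treating y as a vowel and giving every word at least one syllable are my assumptions, not part of the stated rule.

```javascript
// Rough syllable count: vowels, skipping a vowel that directly
// follows another vowel and skipping an 'e' at the end of the word
function countSyllables(word) {
  const vowels = 'aeiouy'; // assumption: 'y' counts as a vowel
  const lower = word.toLowerCase();
  let count = 0;
  let previousWasVowel = false;

  for (let i = 0; i < lower.length; i++) {
    const ch = lower.charAt(i);
    let isVowel = vowels.includes(ch);
    // A trailing 'e' is treated as silent
    if (ch === 'e' && i === lower.length - 1) {
      isVowel = false;
    }
    // Only the first vowel in a run of vowels starts a new syllable
    if (isVowel && !previousWasVowel) {
      count++;
    }
    previousWasVowel = isVowel;
  }

  // Assumption: every word has at least one syllable ("the", "be")
  return Math.max(count, 1);
}

console.log(countSyllables('cake'));      // 1
console.log(countSyllables('banana'));    // 3
console.log(countSyllables('the'));       // 1
console.log(countSyllables('beautiful')); // 3
```

As the text says, this is only approximate: a word like “syllable” comes out as 2 because its final e is skipped.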
Again as you will see next week, the above could be vastly improved using Regular Expressions, but it’s nice as an exercise to learn how to do all the string manipulation manually before you move on to more advanced techniques.
Counting sentences is a bit simpler. I’ll just split the content using periods, question marks, exclamation points, etc. (“.:;?!”) as delimiters and count the total number of elements in the resulting array. This isn’t terribly accurate; for example, “My e-mail address is daniel.shiffman@nyu.edu.” will be counted as three sentences. Nevertheless, as a first pass, this will do.
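The sentence count might be sketched as below. countSentences() is a name I made up; it scans for the delimiters rather than calling splitTokens(), and ignoring pieces that contain only whitespace is my own choice.

```javascript
// Count the pieces of text that fall between sentence delimiters
function countSentences(text) {
  const delimiters = '.:;?!';
  let count = 0;
  let current = '';
  for (const ch of text) {
    if (delimiters.includes(ch)) {
      // Only count pieces that contain something besides whitespace
      if (current.trim().length > 0) count++;
      current = '';
    } else {
      current += ch;
    }
  }
  // Text after the last delimiter counts as a sentence too
  if (current.trim().length > 0) count++;
  return count;
}

console.log(countSentences('Hello there. How are you? Fine!')); // 3
// The periods inside the address are counted as sentence breaks
console.log(countSentences('My e-mail address is daniel.shiffman@nyu.edu.')); // 3
```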
Now, all we need to do is apply the formula and generate a report as a string (which can be inserted into a DOM element using html()).
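The final step can be sketched as two small functions. fleschIndex() and buildReport() are names I made up, the report layout is an assumption, and the formula is the standard Flesch Reading Ease formula, in which both terms are subtracted from 206.835.

```javascript
// Apply the Flesch formula to the three totals
function fleschIndex(words, sentences, syllables) {
  return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words);
}

// Build an HTML string suitable for passing to html() on a p5 element
function buildReport(words, sentences, syllables) {
  const score = fleschIndex(words, sentences, syllables);
  return [
    'Total words: ' + words,
    'Total sentences: ' + sentences,
    'Total syllables: ' + syllables,
    'Flesch Index: ' + score.toFixed(1)
  ].join('<br>');
}

// Made-up totals: 100 words, 5 sentences, 150 syllables
console.log(buildReport(100, 5, 150));
```

With these made-up totals, the words per sentence term is 1.015 * 20 = 20.3 and the syllables per word term is 84.6 * 1.5 = 126.9, so the score is about 59.6.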
In class, we’ll do an exercise around mashing up text manually. Here are links to further reading and information about the techniques we discussed, as well as online versions of the algorithms. For your homework you can choose to work with one of these methods manually or programmatically.