Welcome back for Unit 3.
-
This unit introduces the next big idea we need for a web crawler,
-
which is structured data.
-
And by the end of this unit you will have finished building a working web crawler.
-
The closest thing we've seen so far to structured data
-
is the string data type introduced in Unit 1 and used
-
in many of the procedures in Unit 2.
-
A string is a kind of structured data, and that's because
-
we can break it down into its characters.
-
The string has a sequence of characters,
-
and we can operate on subsequences of the string.
-
What we could do with strings was somewhat limited, though,
-
because the only thing we can put in a string is a character.
-
Today, we're going to introduce the list data type,
-
and lists are much more powerful than strings,
-
so whereas for a string, all of the elements had to be characters,
-
in a list, the elements can be anything we want.
-
They could be characters. They could be strings.
-
They could be numbers. They could also be other lists.
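
To make the contrast with strings concrete, here is a minimal Python sketch of lists holding each kind of element mentioned above. All variable names and values are hypothetical, chosen only for illustration.

```python
# Hypothetical values; none of these come from the lecture itself.
characters = ['a', 'b', 'c']           # a list of single characters
words = ['web', 'crawler']             # a list of strings
numbers = [1, 2.5, 3]                  # a list of numbers
nested = [['a', 1], ['b', 2]]          # a list whose elements are lists
mixed = ['a', 'web', 3, [1, 2]]        # different kinds of elements together

print(len(mixed))   # 4
print(mixed[3])     # [1, 2]
```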
-
Let's look at an example.
-
When we created a string, we just put a sequence of characters
-
surrounded by either single or double quotes.
-
Here's an example of a string,
-
and we could store that string in a variable by using an assignment.
-
With a list, instead of using quotes to identify the list
-
we use square brackets, and the elements are separated by commas.
-
And just like with a string, we can assign the list that we created
-
to a variable, so we'll store that list in the variable "p."
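
The on-screen example is not captured in the transcript, so the following is a sketch under an assumption: the string 'yabba!' is a guess, chosen only because it is consistent with the results mentioned in this unit (a first character "y" and the substring "bb"). The name s is likewise assumed; p is the variable named in the lecture.

```python
# Assumed example string; the transcript does not show the actual one.
s = 'yabba!'

# The same characters as a list: square brackets, elements separated by commas.
p = ['y', 'a', 'b', 'b', 'a', '!']

print(type(s).__name__)  # str
print(type(p).__name__)  # list
```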
-
With a string, we could use the square brackets
-
to select elements, and when we index element 0,
-
we'll get the first element of the string, the first character in the sequence,
-
which is the character "y."
-
With lists, we can also use square brackets to access elements,
-
so if we do p[0],
-
that will evaluate to the first element of p,
-
which is the string containing the single letter y.
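
A short sketch of indexing on both types, again assuming the hypothetical string 'yabba!' and a matching list (the actual on-screen values are not in the transcript):

```python
s = 'yabba!'                          # assumed example string
p = ['y', 'a', 'b', 'b', 'a', '!']    # assumed example list

print(s[0])          # y
print(p[0])          # y
print(s[0] == p[0])  # True: in Python, a character is just a string of length 1
```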
-
With strings, we saw that we could use the colon inside the square brackets
-
to select a substring of more than 1 character.
-
Here we're selecting from position 2 up to, but not including, position 4.
-
That will give us the third and fourth characters of the string,
-
which is the substring "bb."
-
We can do the same thing with lists.
-
We can select from position 2 up to, but not including, position 4,
-
but instead of returning a string, it will return a list
-
containing those elements.
-
It will give us a list of the third and fourth elements
-
of the variable p, which is the list that we have here.
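
The slicing behavior can be sketched as follows, using the same assumed values as before ('yabba!' is a guess consistent with the "bb" result; it is not shown in the transcript). Note that the end position is excluded, so 2:4 covers positions 2 and 3.

```python
s = 'yabba!'                          # assumed example string
p = ['y', 'a', 'b', 'b', 'a', '!']    # assumed example list

print(s[2:4])   # bb          (slicing a string gives a string)
print(p[2:4])   # ['b', 'b']  (slicing a list gives a list)
```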
-
The general grammar for constructing a list
-
is to have an opening square bracket, followed by
-
any number of expressions, where the expressions
-
are separated by commas, followed by a closing square bracket.
-
We could create a list using just 2 brackets,
-
a left bracket and a right bracket, and this would create a list
-
containing 0 elements, also known as the empty list.
-
We could create a list containing 1 element.
-
That would be the square brackets with 1 element between them.
-
Here we've created a list containing just 1 element,
-
which is the number 3.
-
Or we could create a list with many elements, as we did in the first example,
-
where we have all of the strings separated by commas.
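
The three cases above, plus the point that list elements are expressions, can be sketched like this. The variable names are hypothetical, and the many-element list reuses the assumed 'yabba!' characters rather than the actual on-screen example.

```python
empty = []                               # zero elements: the empty list
single = [3]                             # one element, the number 3
many = ['y', 'a', 'b', 'b', 'a', '!']    # many elements (assumed values)

# Each element is an expression, evaluated when the list is built.
computed = [1 + 2, 'a' * 2, len(many)]

print(len(empty))   # 0
print(single[0])    # 3
print(computed)     # [3, 'aa', 6]
```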