Understanding Python syntax in lists vs series
I am new to Python (no computer science background) for data science. I keep hearing that Python is easy, but I am making incremental progress. As an example, I understand:
len(titles[(titles.year >= 1950) & (titles.year <=1959)])
"In the titles dataframe, create a series and take from the year column of the titles dataframe anything greater than or equal to 1950 AND anything less than or equal to 1959. Then take the length of it."
But when I encounter the following, I don't understand the logic of:
t = titles
(t.year // 10 * 10).value_counts().sort_index().plot(kind='bar')
or
titles.title.value_counts().head(10)
In both these cases, I can piece it together obviously.
But it is not clear. In the second, why does Python not allow me to use square brackets and regular brackets like in the first example?
python pandas syntax
Off hand, this doesn't appear to be just "vanilla" python. Do you have any libraries you are using? (numpy, scipy, anaconda, etc.) If you had to run a "pip" command, that installs libraries. It would be helpful to note / tag what libraries you are using.
– Mark Ribau
3 hours ago
2
@MarkRibau Looks like pandas.
– gmds
3 hours ago
1
Judging from the word "dataframes," pandas is right.
– kindall
3 hours ago
1
Where would you expect to use square brackets in your other examples?
– Code-Apprentice
3 hours ago
You could use the brackets on t.year as well, you just don't. I'm not sure I understand your confusion, exactly, can you elaborate?
– juanpa.arrivillaga
3 hours ago
asked 3 hours ago by DataNoob7; edited 3 hours ago by kindall
3 Answers
This is not about lists vs pd.Series, but rather about the function of parentheses (()) vs brackets ([]) in Python.
Parentheses are used in two main cases: to modify the order of precedence of operations, and to delimit arguments when calling functions.
The difference between 1 + 2 * 3 and (1 + 2) * 3 is obvious, and if you want to pass a and b to a function f, you have to write f(a, b); f a b will not work, unlike in, say, Haskell.
We are concerned mostly with the first use here; for example, in this line:
(t.year // 10 * 10).value_counts().sort_index().plot(kind='bar')
Without the parentheses, you would be calling that chain of methods on 10, which wouldn't make sense, because a method call binds more tightly than an arithmetic operator. Clearly, you want to call them on the result of the parenthesised expression.
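A small plain-Python sketch of the same precedence rule may help. The strings are made up for illustration and have nothing to do with the titles data; the point is only that a method call applies to the operand right next to it unless parentheses group a larger expression first:

```python
# A method call binds more tightly than an operator such as + or //.
without_parens = "ab" + "cd".upper()   # .upper() applies to "cd" only
with_parens = ("ab" + "cd").upper()    # .upper() applies to the combined string

print(without_parens)  # abCD
print(with_parens)     # ABCD
```

The parentheses in (t.year // 10 * 10) play the same role as the ones around "ab" + "cd" here.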
Now, in mathematics, brackets can also be used to denote precedence, in conjunction with parentheses, in a case where multiple nested parentheses would be confusing. For example, the two may be equivalent in mathematics:
[(1 + 2) * 3] ** 4
((1 + 2) * 3) ** 4
However, that is not the case in Python: ((1 + 2) * 3) ** 4 can be evaluated, whereas [(1 + 2) * 3] ** 4 raises a TypeError, since the part within brackets resolves to a list, and you can't perform exponentiation on lists.
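As a quick check of that claim, here is a minimal sketch you could paste into an interpreter. The variable names are only for illustration:

```python
# Parentheses only group: this is plain arithmetic.
grouped = ((1 + 2) * 3) ** 4      # 9 ** 4

# Brackets build a one-element list instead of grouping.
as_list = [(1 + 2) * 3]           # [9]

# Raising a list to a power is not defined, so Python raises TypeError.
try:
    as_list ** 4
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(grouped)        # 6561
print(as_list)        # [9]
print(error_message)
```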
Rather, what happens in something like titles[titles.year >= 1950] is not directly relevant to precedence (though of course anything outside the brackets will not be part of the inner expression).
Instead, the brackets represent indexing; in some way, the value of titles.year >= 1950 is used to get elements from titles (this is done using overloading of the __getitem__ dunder method).
The exact nature of this indexing may differ; lists take integers (or slices), dicts take any hashable object, and pd.Series and pd.DataFrame objects take, among other things, boolean pd.Series (that is what is happening here), but they ultimately represent some way to subset the indexed object.
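To make the three flavours of indexing concrete, here is a rough sketch. The small titles frame is a hypothetical stand-in for the asker's data (the real one is not shown in the question), and the list and dict are invented for the example:

```python
import pandas as pd

# Hypothetical stand-in for the question's `titles` DataFrame.
titles = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "year": [1950, 1957, 1963, 1968, 1971],
})

letters = ["a", "b", "c"]
second_letter = letters[1]        # list: index by integer position

ages = {"ann": 31, "bob": 27}
ann_age = ages["ann"]             # dict: index by hashable key

# pandas: index with a boolean Series, one True/False per row.
mask = (titles.year >= 1950) & (titles.year <= 1959)
fifties = titles[mask]            # keeps only the rows where mask is True

print(mask.tolist())              # [True, True, False, False, False]
print(len(fifties))               # 2
```

In every case the brackets ask the object for a subset of itself; only the kind of key differs.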
Semantically, therefore, we can see that brackets mean something different from parentheses, and are not interchangeable.
For completeness, using brackets as opposed to parentheses has one tangible benefit: it permits reassignment, because it automatically delegates to either __setitem__ or __getitem__, depending on whether assignment is being performed.
Therefore, you could do something like titles[titles.year >= 1950] = 'Nothing' if you wanted. However, titles(titles.year >= 1950) is call syntax, which delegates to __call__, and a call can never be the target of an assignment, so titles(titles.year >= 1950) = 'Nothing' will always fail in the following way (the exact wording depends on the Python version):
SyntaxError: can't assign to function call
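If you want to see the delegation for yourself, the sketch below uses a made-up class, Probe, that merely records which special method each piece of syntax triggers. It is not how pandas is implemented, just an illustration of the mechanism:

```python
class Probe:
    """Hypothetical class that logs which special method was invoked."""

    def __init__(self):
        self.log = []

    def __getitem__(self, key):
        self.log.append(("__getitem__", key))
        return key

    def __setitem__(self, key, value):
        self.log.append(("__setitem__", key, value))

    def __call__(self, arg):
        self.log.append(("__call__", arg))
        return arg


probe = Probe()
probe["a"]          # reading with brackets
probe["a"] = 1      # assigning with brackets
probe("a")          # calling with parentheses

# Assigning to a call is rejected when the source is compiled,
# before any method gets a chance to run.
try:
    compile("probe('a') = 1", "<example>", "exec")
    compiled = True
except SyntaxError:
    compiled = False

print(probe.log)
print(compiled)     # False
```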
1
"Now, in mathematics, brackets can also be used to denote precedence, in conjunction with parentheses, in a case where multiple nested parentheses would be confusing." This might hit on the main confusion if the OP is familiar with this usage in algebra.
– Code-Apprentice
3 hours ago
Wow, thank you for the fulsome response. Yes, while my background in algebra is small, that has definitely caused me some confusion learning Python. This is the point - [] vs (). In particular, in Pandas. I am trying to be able to understand the logic, (hence my internal mentalese). So in Pandas, are [] for indexing and not making a series?
– DataNoob7
2 hours ago
@DataNoob7 Remember that indexing creates some form of subset! So, if you have a Series, you can index it to get another Series. If you mean from raw data, then that's a function call - something like pd.Series(data).
– gmds
2 hours ago
@gmds Stupid question - How is indexing defined in computer science. To me, the word "indexing" is just creating a "table of contents" to identify different parts of something. But you haven't actually extracted anything yet. So your phrase "indexing creates some form of subset" throws me off. To me you are just assigning a number, a letter, etc to your dataframe so you have not taken a subset of anything yet?
– DataNoob7
2 hours ago
@DataNoob7 In this context, "indexing" is something you do to a collection, which is an object that contains a number (including one or zero) of objects. Examples of collections are lists, tuples, dicts and pd.Series. Indexing basically tells the collection to return some subset of the objects it contains, based on the arguments passed to the indexing function. For instance, for lists, you pass integer indices and get the elements at those positions.
– gmds
2 mins ago
Square brackets are used for indexes on lists and dictionaries (and things that act like these). On the other hand, parentheses are used for a variety of reasons. In this case, they are used for grouping in (t.year // 10 * 10) or as a function call in value_counts() and other places.
In the case of a library like pandas, whether you use indexing notation with [] or a function call is entirely determined by the implementation of the library. You can learn these details through tutorials and the library's documentation.
Before digging deeper into the pandas library, I suggest that you study the basics of Python syntax. The official tutorial is a good place to start.
On a side note, when you write code, do not make each line as complex as what you see in these examples. You should instead break things into smaller pieces and assign intermediate parts to variables. For example, you can take
(t.year // 10 * 10).value_counts().sort_index().plot(kind='bar')
and turn it into
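decade = (t.year // 10 * 10)     # round each year down to its decade
counts = decade.value_counts()   # number of titles per decade
sorted = counts.sort_index()     # order by decade rather than by count
sorted.plot(kind='bar')          # draw the bar chart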
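For anyone who wants to run the decomposition, here is a sketch with a tiny made-up titles frame (an assumption, since the real data is not part of the question). It uses the name sorted_counts to avoid shadowing the builtin sorted, and leaves the plot call commented out because it needs matplotlib:

```python
import pandas as pd

# Hypothetical stand-in for the question's `titles` DataFrame.
titles = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "year": [1950, 1957, 1963, 1968, 1971],
})
t = titles

decade = t.year // 10 * 10            # Series of decades, one per row
counts = decade.value_counts()        # Series: decade -> number of rows
sorted_counts = counts.sort_index()   # same counts, ordered by decade
# sorted_counts.plot(kind="bar")      # uncomment if matplotlib is installed

# The one-line version from the question gives the same result.
chained = (t.year // 10 * 10).value_counts().sort_index()

print(sorted_counts)
print(sorted_counts.equals(chained))  # True
```

Each intermediate name holds an ordinary pandas object, so you can print it or inspect its type before moving on to the next step.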
1
I have to agree. If just starting out (especially if you're new to programming in general), start with python basics before jumping into the data science part w/ pandas, numpy, etc.
– Mark Ribau
3 hours ago
@anyone I know sorted is a bad name here because of the builtin function. If anyone has a better suggestion, feel free to edit.
– Code-Apprentice
3 hours ago
What about sorted_counts, or, to be more specific (and, unfortunately, verbose), index_sorted_counts?
– gmds
3 hours ago
I have started "Python basics" through other means, but I will check out the official tutorial. I have a lot of data that I need to analyze for work (many datasets ranging from 700k to 40 million), so I need to accelerate this though. The frustrating part is that I know what I have to do, but it is translating it into Python code that is very difficult. I understood [] to denote a series in pandas, but it is also indexing? Indexing outside of pandas? To the point of code length - maybe that is the issue.
– DataNoob7
2 hours ago
t = titles
(t.year // 10 * 10).value_counts().sort_index().plot(kind='bar')
titles is a data frame. year is a column in that frame. In order, the operations are:
- Divide the year by 10 (integer division) and multiply by 10. This truncates the last digit to 0, so that each year is the beginning of its decade. The result of this is another column, the same length as the original.
- Count the values; this will produce a new table with an entry (year, frequency) for each decade-year.
- Sort this table by its index, that is, by decade.
- Make a bar plot of the result.
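The first step is plain integer arithmetic, so you can try it without pandas. This is a small sketch with invented sample years:

```python
years = [1950, 1957, 1963, 1999]

# // is floor division; // and * have equal precedence and evaluate left to right,
# so year // 10 * 10 means (year // 10) * 10.
decades = [year // 10 * 10 for year in years]

# Grouping the other way gives something quite different.
regrouped = 1957 // (10 * 10)

print(decades)    # [1950, 1950, 1960, 1990]
print(regrouped)  # 19
```

pandas applies the same arithmetic to every element of the year column at once, which is why the result is a column of the same length.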
Does that get you going?
Thanks - I can understand what the code means, it's understanding the logic of why the code is written as such (see my original post).
– DataNoob7
2 hours ago
edited 3 hours ago
answered 3 hours ago
gmds
1
"Now, in mathematics, brackets can also be used to denote precedence, in conjunction with parentheses, in a case where multiple nested parentheses would be confusing." This might hit on the main confusion if the OP is familiar with this usage in algebra.
– Code-Apprentice
3 hours ago
Wow, thank you for the fulsome response. Yes, while my background in algebra is small, that has definitely caused me some confusion learning Python. This is the point - [] vs (). In particular, in Pandas. I am trying to be able to understand the logic, (hence my internal mentalese). So in Pandas, are [] for indexing and not making a series?
– DataNoob7
2 hours ago
@DataNoob7 Remember that indexing creates some form of subset! So, if you have a `Series`, you can index it to get another `Series`. If you mean from raw data, then that's a function call - something like `pd.Series(data)`.
– gmds
2 hours ago
@gmds Stupid question - How is indexing defined in computer science. To me, the word "indexing" is just creating a "table of contents" to identify different parts of something. But you haven't actually extracted anything yet. So your phrase "indexing creates some form of subset" throws me off. To me you are just assigning a number, a letter, etc to your dataframe so you have not taken a subset of anything yet?
– DataNoob7
2 hours ago
@DataNoob7 In this context, "indexing" is something you do to a collection, which is an object that contains a number (including one or zero) of objects. Examples of collections are lists, tuples, dicts and `pd.Series`. Indexing basically tells the collection to return some subset of the objects it contains, based on the arguments passed to the indexing function. For instance, for lists, you pass integer indices and get the elements at those positions.
– gmds
2 mins ago
Square brackets are used for indexing lists and dictionaries (and things that act like these). On the other hand, parentheses are used for a variety of reasons. In this case, they are used for grouping in `(t.year // 10 * 10)` or as a function call in `value_counts()` and other places.
In the case of a library like pandas, whether you use indexing notation with `[]` or a function call is entirely determined by the implementation of the library. You can learn these details through tutorials and the library's documentation.
Before digging deeper into the pandas library, I suggest that you study the basics of Python syntax. The official tutorial is a good place to start.
On a side note, when you write code, do not make each line as complex as what you see in these examples. You should instead break things into smaller pieces and assign intermediate parts to variables. For example, you can take
(t.year // 10 * 10).value_counts().sort_index().plot(kind='bar')
and turn it into
decade = (t.year // 10 * 10)
counts = decade.value_counts()
sorted = counts.sort_index()
sorted.plot(kind='bar')
edited 3 hours ago
answered 3 hours ago
Code-Apprentice
1
I have to agree. If just starting out (especially if you're new to programming in general), start with python basics before jumping into the data science part w/ pandas, numpy, etc.
– Mark Ribau
3 hours ago
@anyone I know `sorted` is a bad name here because of the builtin function. If anyone has a better suggestion, feel free to edit.
– Code-Apprentice
3 hours ago
What about `sorted_counts`, or, to be more specific (and, unfortunately, verbose), `index_sorted_counts`?
– gmds
3 hours ago
I have started "Python basics" through other means, but I will check out the official tutorial. I have a lot of data that I need to analyze for work(many datasets ranging from 700k to 40 million), so I need to accelerate this though. The frustrating part is that I know what I have to, but it is translating it into Python code that is very difficult. I understood [] to denote a series in pandas, but it is also indexing? Indexing outside of pandas? To the point of code length - maybe that is the issue.
– DataNoob7
2 hours ago
t = titles
(t.year // 10 * 10).value_counts().sort_index().plot(kind='bar')
`titles` is a data frame. `year` is a column in that frame. In order, the operations are:
- Divide the year by 10 (integer division) and multiply by 10. This truncates the last digit to 0, so that each year is the beginning of its decade. The result of this is another column, the same length as the original.
- Count the values; this will produce a new table with an entry (year, frequency) for each decade-year.
- Sort this table by the default index
- Make a bar plot of the result.
Does that get you going?
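The steps listed above can be mirrored in plain Python, without pandas, to show what each stage produces. This is only a sketch: the sample years are made up, `collections.Counter` stands in for `value_counts()`, and the plotting step is replaced by printing.

```python
from collections import Counter

years = [1994, 1999, 2001, 1985, 1990, 2008]   # made-up sample data

# Step 1: integer-divide by 10 and multiply by 10 to get each decade's start.
decades = [year // 10 * 10 for year in years]

# Step 2: count how many times each decade occurs.
counts = Counter(decades)

# Step 3: sort the (decade, frequency) pairs by decade.
sorted_counts = sorted(counts.items())

# Step 4 (plotting) is left out; printing stands in for the bar chart.
for decade, frequency in sorted_counts:
    print(decade, frequency)
```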
answered 3 hours ago
Prune
Thanks - I can understand what the code means, it's understanding the logic of why the code is written as such (see my original post).
– DataNoob7
2 hours ago
Off hand, this doesn't appear to be just "vanilla" python. Do you have any libraries you are using? (numpy, scipy, anaconda, etc.) If you had to run a "pip" command, that installs libraries. It would be helpful to note / tag what libraries you are using.
– Mark Ribau
3 hours ago
2
@MarkRibau Looks like `pandas`.
– gmds
3 hours ago
1
Judging from the word "dataframes," pandas is right.
– kindall
3 hours ago
1
Where would you expect to use square brackets in your other examples?
– Code-Apprentice
3 hours ago
You could use the brackets on `t.year` as well, you just don't. I'm not sure I understand your confusion, exactly, can you elaborate?
– juanpa.arrivillaga
3 hours ago