Do neurons of a neural network model a linear relationship?



I'm certain that this is a very naive question, but I am just beginning to look more deeply at neural networks, having only used decision tree approaches in the past. Also, my formal mathematics training is more than 30 years in the past, so please be kind. :)

As I'm reading François Chollet's book on Deep Learning, I'm struck that it appears that we are effectively treating the weights (kernel and biases) as terms in the standard linear equation ($y=mx+b$), where we instead state

    newWeight = (currentWeight . input) + bias
    newWeight = (newWeight < 0 ? 0 : newWeight)

Am I reading too much into this, or is this correct (and so fundamental I shouldn't be asking about it)?

Tags: neural-networks, ai-basics, activation-function

asked 9 hours ago by David Hoelzer (new contributor); edited 42 mins ago by nbro

2 Answers


In a neural network (NN), a neuron can act as a linear operator, but it usually acts as a non-linear one. The usual equation of a neuron $i$ in layer $l$ of an NN is

$$o_i^l = \sigma(\mathbf{x}_i^l \cdot \mathbf{w}_i^l + b_i^l),$$

where $\sigma$ is a so-called activation function, which is usually a non-linearity, but it can also be the identity function, $\mathbf{x}_i^l$ and $\mathbf{w}_i^l$ are the vectors that respectively contain the inputs and the weights for neuron $i$ in layer $l$, and $b_i^l \in \mathbb{R}$ is a bias. Similarly, the output of a layer of a feed-forward neural network (FFNN) is computed as

$$\mathbf{o}^l = \sigma(\mathbf{X}^l \mathbf{W}^l + \mathbf{b}^l).$$
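As a concrete reading of the two equations above, here is a minimal sketch in plain Python. The function names (`identity`, `neuron_output`, `layer_output`) are illustrative and not part of the answer, and the layer is assumed to be stored as one weight list per neuron rather than as a single matrix $\mathbf{W}^l$. With the identity as $\sigma$, each neuron reduces to the multi-input analogue of $y = mx + b$.

```python
from typing import Callable, List


def identity(z: float) -> float:
    """Identity activation: returns the pre-activation unchanged."""
    return z


def neuron_output(x: List[float], w: List[float], b: float,
                  sigma: Callable[[float], float] = identity) -> float:
    """Compute sigma(x . w + b) for a single neuron."""
    z = sum(xi * wi for xi, wi in zip(x, w)) + b  # dot product plus bias
    return sigma(z)


def layer_output(x: List[float], weights: List[List[float]],
                 biases: List[float],
                 sigma: Callable[[float], float] = identity) -> List[float]:
    """Apply every neuron of a layer to the same input vector x.

    Assumption: weights[j] holds the weight vector of neuron j.
    """
    return [neuron_output(x, w, b, sigma) for w, b in zip(weights, biases)]


print(neuron_output([1.0, 2.0], [3.0, 4.0], 0.5))  # 1*3 + 2*4 + 0.5 = 11.5
```

Passing a different function as `sigma` is the only change needed to turn this linear neuron into a non-linear one.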



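In your specific example, the result is set to $0$ if the output of the linear combination is less than $0$; otherwise, the output of the linear combination is used unchanged. This is the definition of the ReLU activation function, $\mathrm{ReLU}(z) = \max(0, z)$, which is a non-linear function. Note that the value computed this way is the neuron's output (its activation), not a new weight.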
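To illustrate why ReLU makes the neuron non-linear, here is a small sketch; the names `relu` and `relu_neuron` are illustrative. A linear function $f$ must satisfy $f(a + c) = f(a) + f(c)$ for all inputs, and the check at the end shows that ReLU does not.

```python
from typing import List


def relu(z: float) -> float:
    """ReLU activation: max(0, z)."""
    return z if z > 0 else 0.0


def relu_neuron(x: List[float], w: List[float], b: float) -> float:
    """Linear combination followed by ReLU, as in the question's two lines."""
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    return relu(z)


# Additivity check: a linear function would give the same value on both lines.
a, c = -1.0, 1.0
print(relu(a + c))        # 0.0
print(relu(a) + relu(c))  # 1.0
```

The linear part ($\mathbf{x} \cdot \mathbf{w} + b$) is still there inside the neuron; the clipping at zero is what breaks linearity.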







answered 8 hours ago by nbro (edited 8 hours ago)

• Yes, I understand that... And I realize I was using ReLU rather than sigmoid... but am I wrong to see a clear relationship between how we are currently training networks and the classic function that defines a line? – David Hoelzer, 8 hours ago

• Oh, and a follow-up... From my reading over the last week, I had the impression that sigmoid was out of favor in current thought. Is that not true? I realize it was the standard just a few years ago. – David Hoelzer, 8 hours ago

• @DavidHoelzer In the equation of a line, $\sigma$ is always the identity function, while in the case of NNs, $\sigma$ is almost never the identity function. This is the main difference. – nbro, 8 hours ago

• @DavidHoelzer It seems to me that ReLU (or one of its variations) is used more often than sigmoids, but I think you can find several discussions on the web related to this topic. – nbro, 8 hours ago


Taking the question from the comments on nbro's answer:

> Am I wrong to see a clear relationship between how we are currently training networks and the classic function that defines a line?

You are right about it. This is an intuitive way to understand neural networks. You can create a neural network that does only simple linear regression by using linear activation functions in all the layers, so that the output of the network (the model) is a linear combination of the inputs. This also seems like a great way to introduce neural networks to students.
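The claim that linear activations in every layer leave you with plain linear regression can be checked with a short sketch. It uses one input and one neuron per layer for readability, and the function names are illustrative: two stacked layers $w_2(w_1 x + b_1) + b_2$ collapse algebraically into a single line with slope $w_2 w_1$ and intercept $w_2 b_1 + b_2$.

```python
from typing import Tuple


def linear_layer(x: float, w: float, b: float) -> float:
    """One neuron with identity activation: w * x + b."""
    return w * x + b


def two_layer_linear(x: float, w1: float, b1: float,
                     w2: float, b2: float) -> float:
    """Two stacked linear layers with no non-linearity in between."""
    return linear_layer(linear_layer(x, w1, b1), w2, b2)


def collapsed(w1: float, b1: float, w2: float, b2: float) -> Tuple[float, float]:
    """Return (m, c) such that the two-layer network equals m * x + c."""
    return w2 * w1, w2 * b1 + b2


m, c = collapsed(2.0, 1.0, 3.0, -4.0)
print(m, c)  # 6.0 -1.0
print(two_layer_linear(5.0, 2.0, 1.0, 3.0, -4.0), m * 5.0 + c)  # 29.0 29.0
```

The same argument holds for vectors and matrices, since the product of two matrices is again a matrix; adding depth without a non-linearity adds no expressive power.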



But one must also keep in mind that neural networks, once non-linear activation functions are used, provide the flexibility to model many kinds of non-linear relationships.
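As a small illustration of that flexibility, the sketch below hand-picks the weights of a tiny ReLU network (it is not trained, and the names are illustrative) so that it computes $|x|$. No single line can do this: any line $f$ satisfies $f(0) = \frac{f(-1) + f(1)}{2}$, whereas $|0| = 0$ and $\frac{|-1| + |1|}{2} = 1$.

```python
def relu(z: float) -> float:
    """ReLU activation: max(0, z)."""
    return z if z > 0 else 0.0


def abs_net(x: float) -> float:
    """Hidden layer of two ReLU neurons (weights +1 and -1, zero bias),
    followed by a linear output neuron with weights (1, 1) and zero bias."""
    h1 = relu(1.0 * x)   # passes positive inputs through
    h2 = relu(-1.0 * x)  # passes the magnitude of negative inputs through
    return 1.0 * h1 + 1.0 * h2


for x in (-3.0, 0.0, 2.0):
    print(x, abs_net(x))
```

Each hidden neuron is still "a line followed by a clip", which is why the $y = mx + b$ intuition remains useful even for non-linear networks.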







answered 3 hours ago by naive
