Why are length-prefixed fields considered hardware unfriendly?


One criticism of GENEVE (a network encapsulation protocol) is that it uses tag-length-value fields, and these are hard to process in hardware.



Why is this considered hardware-unfriendly? What approaches would be more friendly to high-speed hardware implementation, and why?










      protocol networking






      asked 2 hours ago









      Demi


          1 Answer

Tag (or type)-length-value (TLV) is a method of storing different kinds of variable-length information in a single data structure.



You only need it when there is no common subset of information that every instance of the data structure must carry, or when the order of fields has to vary for some reason.



Think about it this way: if all GENEVE packets had a "destination" field that contained, say, a 64-bit network address, why not simply define that every packet starts with 64 bits of destination address, and save yourself one tag and one length field?



Hardware (and that includes hardware that runs software!) is good at looking up values at specific positions. So, parsing a header where the destination address is always at position x and always has length y is very easy; you just read memory address x and interpret the result as an integer of length y. That's what CPUs, for example, do for every single thing they read from RAM.
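
As a rough illustration of the fixed-offset case, here is a minimal Python sketch. The header layout (a 64-bit destination followed by a 16-bit type field) and the names `FIXED_HEADER` and `parse_fixed` are assumptions invented for this example; they are not the GENEVE header or any real protocol.

```python
import struct

# Hypothetical layout: 64-bit destination address, then a 16-bit type field,
# both big-endian (network byte order). Not a real protocol header.
FIXED_HEADER = struct.Struct("!QH")

def parse_fixed(packet: bytes) -> tuple:
    """Read destination and type straight from their known offsets."""
    destination, packet_type = FIXED_HEADER.unpack_from(packet, 0)
    return destination, packet_type

# Build a sample packet: 10 header bytes followed by an arbitrary payload.
packet = FIXED_HEADER.pack(0x0011223344556677, 0x0800) + b"payload"
destination, packet_type = parse_fixed(packet)
```

No matter what the payload contains, the parser touches the same ten bytes and does the same amount of work.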



If, on the other hand, you first have to search for specific fields by walking through the TLV fields to understand a packet (OK, the first field is not the tag I'm looking for and has length 14, so jump ahead 14 bytes; aha, still not the field I'm looking for, jump ahead again, …), then parsing that packet will be slow. That holds for software just as much as for hardware implementations.
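
The walk through the fields can be sketched like this. The encoding used here (a 1-byte tag, a 1-byte length counted in bytes, then the value) and the helper names `find_tlv` and `encode_tlv` are assumptions chosen for illustration; real protocols, GENEVE included, define their own option layout.

```python
import struct

def find_tlv(buf: bytes, wanted_tag: int):
    """Walk the TLV fields one by one until the wanted tag is found."""
    offset = 0
    while offset + 2 <= len(buf):
        tag, length = struct.unpack_from("!BB", buf, offset)
        value_start = offset + 2
        value_end = value_start + length
        if value_end > len(buf):
            raise ValueError("truncated TLV field")
        if tag == wanted_tag:
            return buf[value_start:value_end]
        offset = value_end  # not the field we want: skip over its value
    return None  # reached the end without finding the tag

def encode_tlv(tag: int, value: bytes) -> bytes:
    """Helper to build sample data in the hypothetical encoding."""
    return struct.pack("!BB", tag, len(value)) + value

# Two unrelated fields come first, so the parser has to skip both of them.
options = encode_tlv(7, b"x" * 14) + encode_tlv(9, b"abc") + encode_tlv(1, b"\x00\x2a")
wanted = find_tlv(options, 1)
```

Each step depends on the length read in the previous step, so the position of the next field is not known until the current one has been decoded.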



Even worse than slow, the parsing time becomes data-dependent and therefore unpredictable. Some TLV structures might take only a single clock cycle to analyze, while others require iterating through 42 fields to do the same. If you're implementing a signal processing application, accounting for varying delays in one of the steps quickly becomes a nightmare: you suddenly need to buffer input, apply backpressure, or drop data, just because someone decided to have a flexible data structure.
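
To make the data dependence concrete, the sketch below counts how many field headers a sequential parser inspects before it reaches the tag it wants. It assumes a simplified model in which one field is examined per step; `steps_to_find` and the tag numbers are hypothetical.

```python
def steps_to_find(tags, wanted_tag):
    """Count how many TLV headers are inspected before the wanted tag appears."""
    for steps, tag in enumerate(tags, start=1):
        if tag == wanted_tag:
            return steps
    return len(tags)  # tag absent: every field had to be inspected

# Same 42 fields, same wanted tag, different order on the wire.
best_case = steps_to_find([1] + list(range(2, 43)), wanted_tag=1)
worst_case = steps_to_find(list(range(2, 43)) + [1], wanted_tag=1)
```

Under this model the same lookup costs 1 step or 42 steps depending only on what the sender chose to put first, which is exactly the variation a fixed-rate pipeline has to absorb.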



In software, it's often cheap and relatively fast to preallocate a header structure with fixed offsets and fill in its fields as you iterate through the incoming TLV structure. But for that you need RAM, and often quite a lot of it if you can't know which fields you're looking for at the moment you start parsing the structure.
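
A minimal sketch of that software approach follows. The tags `TAG_DESTINATION` and `TAG_PRIORITY`, the `Header` class and the 1-byte tag / 1-byte length encoding are all assumptions made up for this example.

```python
import struct

# Hypothetical tag numbers, for illustration only.
TAG_DESTINATION = 1
TAG_PRIORITY = 2

class Header:
    """Preallocated structure with one fixed slot per field we care about."""
    def __init__(self):
        self.destination = None
        self.priority = None

def normalize(buf: bytes) -> Header:
    """Walk the TLV stream once and fill in the fixed-layout header."""
    header = Header()
    offset = 0
    while offset + 2 <= len(buf):
        tag, length = struct.unpack_from("!BB", buf, offset)
        value = buf[offset + 2 : offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV field")
        if tag == TAG_DESTINATION:
            header.destination = int.from_bytes(value, "big")
        elif tag == TAG_PRIORITY:
            header.priority = int.from_bytes(value, "big")
        # Unknown tags are skipped; they still cost a step to walk past.
        offset += 2 + length
    return header

tlv = (bytes([TAG_PRIORITY, 1, 5])
       + bytes([99, 3, 0, 0, 0])            # unknown field, skipped
       + bytes([TAG_DESTINATION, 2, 0x12, 0x34]))
header = normalize(tlv)
```

The structure needs a slot for every field that might show up, whether or not a given packet carries it, which is where the memory cost comes from.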



So, TLV is a common scheme for serializing weakly structured data for permanent storage or slow transmission. It's usually undesirable for streaming applications, where the same kind of data comes by very often (e.g. network packets, video frames, infrastructure operation commands, …); in that case, you'd much rather predefine fixed structures, even if that wastes a bit of transport bandwidth on occasionally unused fields.



For example, most systems don't use all the fields an Ethernet packet can have. You still wouldn't try to save two bytes: transporting 1490 or 1492 bytes on Gigabit Ethernet doesn't make much difference, but having to check for every single packet whether it is of type A or type B does have a negative impact.
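
A back-of-the-envelope check of the two-byte figure, as a sketch only: it assumes a raw line rate of 1 Gbit/s and ignores preamble, inter-frame gap and other framing overhead.

```python
padded_bytes = 1492    # packet with the occasionally unused field left in
trimmed_bytes = 1490   # packet with those two bytes removed
link_rate_bits_per_s = 1_000_000_000  # assumed raw Gigabit line rate

saved_bytes = padded_bytes - trimmed_bytes
saved_fraction = saved_bytes / padded_bytes                   # share of the packet
saved_time_ns = saved_bytes * 8 / link_rate_bits_per_s * 1e9  # wire time per packet
```

Under these assumptions the saving is 16 ns of wire time per packet, or roughly 0.13 % of its size.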






@Janka raises an important point: assume your hardware's whole job is to copy the whole packet from input to output. Now, instead of telling your DMA engine to copy one packet's worth of data from input to output, you're parsing all the input to figure out how long your data is. That is way, way slower than just copying data.
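
The difference can be sketched as follows. Both formats are hypothetical: one carries a 16-bit total length at a fixed offset, the other is a TLV stream terminated by an end tag. The function names are invented for this example.

```python
import struct

def total_length_from_header(buf: bytes) -> int:
    """Hypothetical fixed header: a 16-bit total length at offset 0."""
    (total,) = struct.unpack_from("!H", buf, 0)
    return total

def total_length_from_tlvs(buf: bytes, end_tag: int = 0):
    """Hypothetical TLV stream closed by an end tag: every field must be visited."""
    offset = 0
    fields_visited = 0
    while True:
        tag, length = struct.unpack_from("!BB", buf, offset)
        offset += 2 + length
        fields_visited += 1
        if tag == end_tag:
            return offset, fields_visited

fixed = struct.pack("!H", 10) + bytes(8)                    # one read gives the size
tlv = bytes([5, 3, 1, 2, 3]) + bytes([6, 1, 9]) + bytes([0, 0])

fixed_total = total_length_from_header(fixed)
tlv_total, tlv_fields = total_length_from_tlvs(tlv)
```

In the first case the copy length is available after a single read; in the second it is only known once the last field has been reached.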




          • 2




            +1, but the very short answer is: DMA is not going to parse the data, so length fields do nothing but add overhead.
            – Janka
            1 hour ago











          • Well, @Janka, yeah, that's the short form of "where others can just pick a fixed position and length of data, the TLV user has to parse the full structure first".
            – Marcus Müller
            1 hour ago










          edited 1 hour ago

























          answered 1 hour ago









          Marcus Müller
