Importing large SQL dump with millions of INSERT statements

I need to import a big .sql file (8.1 GB when unpacked) into PostgreSQL. I tried to use \i /path/to/file.sql, but it is far too slow.



How does one speed up the import? I need to import this data weekly.



The first 2000 lines can be found here, while the compressed 1 GB dump can be found here



--
-- PostgreSQL database dump
--

-- Dumped from database version 9.5.3
-- Dumped by pg_dump version 9.5.2

SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;

--
-- Name: rpo; Type: SCHEMA; Schema: -; Owner: -
--


That's the only way I can get the data. The full file has approx. 38,000,000 lines. How can I speed up the import?







asked Aug 29 at 10:24 by Ondrej Vrabel, edited Aug 29 at 21:47 by Evan Carroll
2 Answers






This dump was generated as individual INSERT statements (with pg_dump --inserts):



          INSERT INTO esa2010_codes VALUES (11002, 'Národn
          INSERT INTO esa2010_codes VALUES (11003, 'Nefina
          INSERT INTO esa2010_codes VALUES (12502, 'Národn
          INSERT INTO esa2010_codes VALUES (11001, 'Verejn
          INSERT INTO esa2010_codes VALUES (12602, 'Národn
          INSERT INTO esa2010_codes VALUES (12603, 'Finanč
          INSERT INTO esa2010_codes VALUES (12503, 'Ostatn


This is documented as being slow (from man pg_dump):




--inserts
Dump data as INSERT commands (rather than COPY). This will make restoration very slow; it is mainly useful for making dumps that can be loaded into non-PostgreSQL databases. However, since this option generates a separate command for each row, an error in reloading a row causes only that row to be lost rather than the entire table contents. Note that the restore might fail altogether if you have rearranged column order. The --column-inserts option is safe against column order changes, though even slower.
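For comparison, the same rows in a default (COPY-based) dump would look roughly like the sketch below; the column list is a placeholder, since the table definition isn't shown here. COPY streams all rows through a single command instead of parsing and executing one INSERT per row, which is a large part of the speed difference.

COPY esa2010_codes (code, name) FROM stdin;  -- column names are placeholders, not the real definition
11002	Národn...
11003	Nefina...
\.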




That's why it's so slow. What you want to do is turn off some of the durability settings, specifically synchronous_commit, though turning off fsync will help too.



You can do this very simply by running the following command before you run your \i file.sql:



          SET synchronous_commit TO off;


That will do a lot to speed it up. Don't forget to turn the durability options back on after you're done. I bet it'll finish in a few hours once you've set that. If you need more speed, though, don't hesitate to turn off fsync and full_page_writes on the cluster until you get the data loaded; I wouldn't do that if the DB already holds data you need, or if it is production. As a last note, if you need the speed and this is a production DB, you can go all out on your own copy: load it there, then dump it with pg_dump's default options (COPY), which you'll be able to load much faster.
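A minimal sketch of the whole psql session under that advice, using the path from the question; the commented ALTER SYSTEM part is the riskier cluster-wide step and belongs only on a copy you can afford to rebuild:

-- session-level: safe; a crash can only lose the most recent commits, not corrupt data
SET synchronous_commit TO off;
\i /path/to/file.sql

-- optional cluster-wide settings (superuser, restore-only cluster):
-- ALTER SYSTEM SET fsync = off;
-- ALTER SYSTEM SET full_page_writes = off;
-- SELECT pg_reload_conf();
-- ...run the import, then revert:
-- ALTER SYSTEM RESET fsync;
-- ALTER SYSTEM RESET full_page_writes;
-- SELECT pg_reload_conf();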






answered Aug 29 at 10:50 by Evan Carroll, edited Aug 29 at 21:45 (accepted answer, 11 votes)

Another option is to run the whole import in one transaction (if that is possible):



BEGIN;
\i dump.sql
COMMIT;


PostgreSQL runs in autocommit mode by default, which means every statement is finished by a commit, and every commit is finished by an fsync (and fsync is pretty slow). That cost can be reduced with asynchronous commit (see Evan Carroll's answer) or reduced to a single commit with an explicit transaction.



Another possibility is disabling the referential-integrity checks (if any are in use). This is acceptable because we can expect the dump to be consistent and correct. See the documentation for ALTER TABLE xx DISABLE TRIGGER ALL; a sketch follows.
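A hedged sketch of that approach, with my_table as a placeholder name; it only applies when the tables already exist before the import, and DISABLE TRIGGER ALL (which includes the internal RI triggers) requires superuser rights:

ALTER TABLE my_table DISABLE TRIGGER ALL;  -- placeholder name; repeat for each table with foreign keys
\i dump.sql
ALTER TABLE my_table ENABLE TRIGGER ALL;   -- re-enable after the load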



The source of your file is pg_dump, so the simplest speedup is to use better options when the dump is created:



1. Don't use the --inserts option; the COPY format is significantly faster to restore.


2. Use the --disable-triggers option to skip the RI checks (this assumes the data is correct).


3. Use the custom format (the -F option). Then you can use pg_restore to restore and to build indexes (the slowest operation) in parallel, as in the sketch below.
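A sketch of that route, with sourcedb and targetdb as placeholder database names and an arbitrary job count:

# custom format: data is written as COPY, not as INSERTs
pg_dump -Fc -f dump.pgdump sourcedb

# targetdb must already exist; -j 4 restores data and builds indexes in parallel
pg_restore -j 4 -d targetdb dump.pgdump

# for a data-only refresh into existing tables, RI checks can be skipped too:
# pg_restore -a --disable-triggers -d targetdb dump.pgdump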






answered Aug 31 at 5:45 by Pavel Stehule (1 vote)



















