If we use 6 mappers in Sqoop to import data from Oracle, how many connections will be established between Sqoop and the source?
Will it be a single connection, or 6 connections, one for each mapper?
As per the Sqoop docs:
Likewise, do not increase the degree of parallelism higher than that which your database can reasonably support. Connecting 100 concurrent clients to your database may increase the load on the database server to a point where performance suffers as a result.
That means all the mappers make concurrent connections to the database.
Also keep in mind that if your table has only 2 records, Sqoop will use only 2 mappers, not all 6.
Check my other answer to understand how the number of mappers works in a Sqoop command.
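To illustrate why a 2-record table ends up with only 2 mappers, here is a minimal Python sketch of range splitting. It is a simplified model, not Sqoop's actual splitter code: the function name split_ranges is hypothetical, and it assumes an integer split column whose values run contiguously from the minimum to the maximum.

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the inclusive integer range [lo, hi] into at most
    num_mappers contiguous sub-ranges (simplified model, not Sqoop code)."""
    total = hi - lo + 1
    # There cannot be more non-empty ranges than distinct key values.
    n = min(num_mappers, total)
    base, extra = divmod(total, n)
    ranges = []
    start = lo
    for i in range(n):
        # Spread any remainder over the first `extra` ranges.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Each range would be read by one mapper over its own connection.
print(split_ranges(1, 600, 6))
print(split_ranges(1, 2, 6))
```

In this model, asking for 6 mappers over keys 1 to 2 yields only two ranges, so only two mappers would have any work to do.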
Each mapper opens an inactive connection as a JDBC client program. The active connections (the ones that actually fire SQL queries) are then shared among multiple mappers.
Run the sqoop import command with --verbose and you will see log lines like these:
DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@192.xx.xx.xx:1521:orcl/dev
DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@192.xx.xx.xx:1521:orcl/dev
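If the verbose output is long, a small script can tally these connection-cache messages. The following is only a sketch: the function name count_conn_cache_events is hypothetical, and it assumes the log wording matches the two DEBUG messages quoted above exactly.

```python
def count_conn_cache_events(log_text):
    """Count connection-cache DEBUG messages in captured Sqoop output.

    Assumes the message wording matches the samples quoted in this answer.
    """
    return {
        "reused": log_text.count("ConnCache: Got cached connection"),
        "released": log_text.count("ConnCache: Caching released connection"),
    }


sample = (
    "DEBUG manager.OracleManager$ConnCache: Got cached connection for "
    "jdbc:oracle:thin:@192.xx.xx.xx:1521:orcl/dev\n"
    "DEBUG manager.OracleManager$ConnCache: Caching released connection for "
    "jdbc:oracle:thin:@192.xx.xx.xx:1521:orcl/dev\n"
)
print(count_conn_cache_events(sample))
```

To use it on a real run, capture the output of the verbose import into a file and pass the file contents to the function.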
-m specifies the number of mapper tasks that run as part of the job, so more mappers means more connections.
It probably depends on the connection manager, but I would guess that all of them create one connection per mapper. Take DirectPostgresqlManager: it creates one connection per mapper through psql COPY TO STDOUT. Please take a look at the managers at Sqoop Managers.