sonalgoyal / hiho

Hadoop Data Integration with various databases, ftp servers, salesforce. Incremental update, dedup, append, merge your data on Hadoop.

Home Page: www.nubetech.co/products

License: Apache License 2.0

Languages: Shell 0.21%, Java 99.79%
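
hiho's own job and mapper classes are not reproduced on this page. Purely as an illustration of the dedup/merge idea in the description above, here is a minimal sketch against the plain Hadoop MapReduce API (not hiho's actual implementation), assuming tab-separated records of key, timestamp, and payload, that keeps only the newest version of each record:

// Minimal sketch of the dedup/merge idea using the plain Hadoop MapReduce API.
// This is NOT hiho's own implementation; the tab-separated record layout
// (key \t timestamp \t payload) is an assumption for illustration only.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class DedupSketch {

  // Emits (business key, full record) so that all versions of a record
  // meet in the same reduce call.
  public static class KeyMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] fields = line.toString().split("\t", 3); // key \t timestamp \t payload
      if (fields.length == 3) {
        ctx.write(new Text(fields[0]), line);
      }
    }
  }

  // Keeps only the version with the highest timestamp for each key,
  // which is one way to express "incremental update / dedup" as a merge.
  public static class LatestWinsReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> versions, Context ctx)
        throws IOException, InterruptedException {
      long bestTs = Long.MIN_VALUE;
      String best = null;
      for (Text v : versions) {
        String[] fields = v.toString().split("\t", 3);
        if (fields.length < 3) {
          continue;
        }
        long ts = Long.parseLong(fields[1]);
        if (ts > bestTs) {
          bestTs = ts;
          best = fields[2];
        }
      }
      if (best != null) {
        ctx.write(key, new Text(best));
      }
    }
  }
}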

hiho's Introduction

Hi there 👋

  • 🔭 I'm currently working on Zingg
  • 🌱 I'm writing my thoughts on data and ML at Substack
  • 🤝 I'm looking to collaborate on data and ML - data quality, data management, metrics, tools...

hiho's People

Contributors

icholy, sonalgoyal


hiho's Issues

Provide an ingress route from Salesforce to Hadoop

Broadly, there are two use cases:

  1. Full refresh of Salesforce objects into HDFS
  2. Incremental refresh based on some defined criteria

The framework should be compatible with at least the following Hadoop versions - 1.0, 0.2x.

Salesforce API compatibility:
Bulk API (http://www.salesforce.com/us/developer/docs/api_asynch/index.htm)
Replication API (http://www.salesforce.com/us/developer/docs/api/index.htm, "Using the API with Salesforce Features -> Data Replication")
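
As a rough sketch of how the incremental-refresh criteria could be expressed, the helper below builds a SOQL query filtered on the standard SystemModstamp field. SOQL and SystemModstamp are standard Salesforce features; the class and method names here are hypothetical illustration only, not hiho code.

// Hedged sketch: build full-refresh and incremental-refresh SOQL queries.
// SOQL and the SystemModstamp field are standard Salesforce; the class and
// method names are hypothetical and not part of hiho.
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class IncrementalSoqlBuilder {

  // Salesforce datetime literals use ISO-8601 in UTC, e.g. 2016-05-01T00:00:00Z.
  private static final DateTimeFormatter SOQL_TS =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'").withZone(ZoneOffset.UTC);

  /** Full refresh: pull every row of the object. */
  public static String fullRefresh(String object, String fields) {
    return "SELECT " + fields + " FROM " + object;
  }

  /** Incremental refresh: only rows changed since the last successful run. */
  public static String incrementalRefresh(String object, String fields, Instant lastRun) {
    return "SELECT " + fields + " FROM " + object
        + " WHERE SystemModstamp > " + SOQL_TS.format(lastRun);
  }

  public static void main(String[] args) {
    System.out.println(fullRefresh("Account", "Id, Name, SystemModstamp"));
    System.out.println(incrementalRefresh("Account", "Id, Name, SystemModstamp",
        Instant.parse("2016-05-01T00:00:00Z")));
  }
}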

hiho DataDrivenDBInputFormat$DataDrivenDBInputSplit cast error

Hi team,

I am trying to ingest data from a MySQL database into HDFS and am getting the following error. Any pointers would be appreciated.

Log Upload Time: Wed May 04 01:54:02 +0530 2016
Log Length: 105798

2016-05-04 01:51:51,655 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1462305309616_0002_000001

2016-05-04 01:51:53,344 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-05-04 01:51:53,623 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2016-05-04 01:51:53,623 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (org.apache.hadoop.yarn.security.AMRMTokenIdentifier@51eed775)
2016-05-04 01:51:54,187 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
2016-05-04 01:51:56,113 WARN [main] org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2016-05-04 01:51:56,437 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2016-05-04 01:51:56,714 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
2016-05-04 01:51:56,723 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
2016-05-04 01:51:56,809 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
2016-05-04 01:51:56,813 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher
2016-05-04 01:51:56,816 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher
2016-05-04 01:51:56,818 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher
2016-05-04 01:51:56,820 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
2016-05-04 01:51:56,835 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.speculate.Speculator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$SpeculatorEventDispatcher
2016-05-04 01:51:56,837 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
2016-05-04 01:51:56,840 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncher$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerLauncherRouter
2016-05-04 01:51:57,016 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://quickstart.cloudera:8020]
2016-05-04 01:51:57,118 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://quickstart.cloudera:8020]
2016-05-04 01:51:57,244 INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://quickstart.cloudera:8020]
2016-05-04 01:51:57,372 INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled
2016-05-04 01:51:57,675 INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler
2016-05-04 01:51:58,351 WARN [main] org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-mrappmaster.properties,hadoop-metrics2.properties
2016-05-04 01:51:58,901 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-05-04 01:51:58,904 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MRAppMaster metrics system started
2016-05-04 01:51:59,039 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Adding job token for job_1462305309616_0002 to jobTokenSecretManager
2016-05-04 01:51:59,873 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Not uberizing job_1462305309616_0002 because: not enabled;
2016-05-04 01:51:59,922 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Input size for job job_1462305309616_0002 = 14. Number of splits = 2
2016-05-04 01:51:59,922 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Number of reduces for job job_1462305309616_0002 = 0
2016-05-04 01:51:59,923 INFO [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1462305309616_0002Job Transitioned from NEW to INITED
2016-05-04 01:51:59,926 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster launching normal, non-uberized, multi-container job job_1462305309616_0002.
2016-05-04 01:52:00,060 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-05-04 01:52:00,184 INFO [Socket Reader #1 for port 33700] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 33700
2016-05-04 01:52:00,292 INFO [main] org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.MRClientProtocolPB to the server
2016-05-04 01:52:00,293 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-05-04 01:52:00,297 INFO [IPC Server listener on 33700] org.apache.hadoop.ipc.Server: IPC Server listener on 33700: starting
2016-05-04 01:52:00,298 INFO [main] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Instantiated MRClientService at quickstart.cloudera/127.0.0.1:33700
2016-05-04 01:52:00,645 INFO [main] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2016-05-04 01:52:00,662 INFO [main] org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2016-05-04 01:52:00,676 INFO [main] org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.mapreduce is not defined
2016-05-04 01:52:00,710 INFO [main] org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2016-05-04 01:52:00,750 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context mapreduce
2016-05-04 01:52:00,750 INFO [main] org.apache.hadoop.http.HttpServer2: Added filter AM_PROXY_FILTER (class=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter) to context static
2016-05-04 01:52:00,765 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /mapreduce/*
2016-05-04 01:52:00,765 INFO [main] org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2016-05-04 01:52:00,787 INFO [main] org.apache.hadoop.http.HttpServer2: Jetty bound to port 56095
2016-05-04 01:52:00,787 INFO [main] org.mortbay.log: jetty-6.1.26.cloudera.4
2016-05-04 01:52:00,898 INFO [main] org.mortbay.log: Extract jar:file:/usr/jars/hadoop-yarn-common-2.6.0-cdh5.5.0.jar!/webapps/mapreduce to /tmp/Jetty_0_0_0_0_56095_mapreduce____.7bfwc2/webapp
2016-05-04 01:52:03,839 INFO [main] org.mortbay.log: Started [email protected]:56095
2016-05-04 01:52:03,839 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Web app /mapreduce started at 56095
2016-05-04 01:52:05,056 INFO [main] org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2016-05-04 01:52:05,064 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: JOB_CREATE job_1462305309616_0002
2016-05-04 01:52:05,075 INFO [main] org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2016-05-04 01:52:05,077 INFO [Socket Reader #1 for port 35130] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 35130
2016-05-04 01:52:05,092 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2016-05-04 01:52:05,092 INFO [IPC Server listener on 35130] org.apache.hadoop.ipc.Server: IPC Server listener on 35130: starting
2016-05-04 01:52:05,154 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2016-05-04 01:52:05,155 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: maxTaskFailuresPerNode is 3
2016-05-04 01:52:05,155 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 33
2016-05-04 01:52:05,319 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2016-05-04 01:52:05,562 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: <memory:8192, vCores:4>
2016-05-04 01:52:05,562 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.cloudera
2016-05-04 01:52:05,572 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2016-05-04 01:52:05,577 INFO [main] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
2016-05-04 01:52:05,609 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1462305309616_0002Job Transitioned from INITED to SETUP
2016-05-04 01:52:05,625 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2016-05-04 01:52:05,696 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1462305309616_0002Job Transitioned from SETUP to RUNNING
2016-05-04 01:52:05,748 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1462305309616_0002_m_000000 Task Transitioned from NEW to SCHEDULED
2016-05-04 01:52:05,750 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1462305309616_0002_m_000001 Task Transitioned from NEW to SCHEDULED
2016-05-04 01:52:05,753 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1462305309616_0002_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2016-05-04 01:52:05,754 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1462305309616_0002_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2016-05-04 01:52:05,756 INFO [Thread-51] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:1024, vCores:1>
2016-05-04 01:52:05,905 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1462305309616_0002, File: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/cloudera/.staging/job_1462305309616_0002/job_1462305309616_0002_1.jhist
2016-05-04 01:52:06,575 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:2 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2016-05-04 01:52:06,989 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1462305309616_0002: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:6144, vCores:7> knownNMs=1
2016-05-04 01:52:07,371 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://quickstart.cloudera:8020]
2016-05-04 01:52:08,064 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2016-05-04 01:52:08,069 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1462305309616_0002_01_000002 to attempt_1462305309616_0002_m_000000_0
2016-05-04 01:52:08,074 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2016-05-04 01:52:08,281 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/cloudera/.staging/job_1462305309616_0002/job.jar
2016-05-04 01:52:08,290 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /tmp/hadoop-yarn/staging/cloudera/.staging/job_1462305309616_0002/job.xml
2016-05-04 01:52:08,297 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #0 tokens and #1 secret keys for NM use for launching container
2016-05-04 01:52:08,297 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of containertokens_dob is 1
2016-05-04 01:52:08,297 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Putting shuffle token in serviceData
2016-05-04 01:52:08,362 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2016-05-04 01:52:08,369 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1462305309616_0002_m_000000_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2016-05-04 01:52:08,390 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1462305309616_0002_01_000002 taskAttempt attempt_1462305309616_0002_m_000000_0
2016-05-04 01:52:08,400 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1462305309616_0002_m_000000_0
2016-05-04 01:52:08,403 INFO [ContainerLauncher #0] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : quickstart.cloudera:44830
2016-05-04 01:52:08,638 INFO [ContainerLauncher #0] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1462305309616_0002_m_000000_0 : 13562
2016-05-04 01:52:08,641 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1462305309616_0002_m_000000_0] using containerId: [container_1462305309616_0002_01_000002 on NM: [quickstart.cloudera:44830]
2016-05-04 01:52:08,654 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1462305309616_0002_m_000000_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2016-05-04 01:52:08,656 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1462305309616_0002_m_000000
2016-05-04 01:52:08,656 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1462305309616_0002_m_000000 Task Transitioned from SCHEDULED to RUNNING
2016-05-04 01:52:09,090 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1462305309616_0002: ask=1 release= 0 newContainers=1 finishedContainers=0 resourcelimit=<memory:4096, vCores:5> knownNMs=1
2016-05-04 01:52:09,090 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2016-05-04 01:52:09,091 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1462305309616_0002_01_000003 to attempt_1462305309616_0002_m_000001_0
2016-05-04 01:52:09,091 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:2 ContRel:0 HostLocal:0 RackLocal:0
2016-05-04 01:52:09,092 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf: Task java-opts do not specify heap size. Setting task attempt jvm max heap size to -Xmx820m
2016-05-04 01:52:09,094 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1462305309616_0002_m_000001_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2016-05-04 01:52:09,098 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1462305309616_0002_01_000003 taskAttempt attempt_1462305309616_0002_m_000001_0
2016-05-04 01:52:09,098 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1462305309616_0002_m_000001_0
2016-05-04 01:52:09,103 INFO [ContainerLauncher #1] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : quickstart.cloudera:44830
2016-05-04 01:52:09,166 INFO [ContainerLauncher #1] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1462305309616_0002_m_000001_0 : 13562
2016-05-04 01:52:09,167 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1462305309616_0002_m_000001_0] using containerId: [container_1462305309616_0002_01_000003 on NM: [quickstart.cloudera:44830]
2016-05-04 01:52:09,168 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1462305309616_0002_m_000001_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2016-05-04 01:52:09,168 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1462305309616_0002_m_000001
2016-05-04 01:52:09,169 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1462305309616_0002_m_000001 Task Transitioned from SCHEDULED to RUNNING
2016-05-04 01:52:10,101 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1462305309616_0002: ask=1 release= 0 newContainers=1 finishedContainers=0 resourcelimit=<memory:3072, vCores:4> knownNMs=1
2016-05-04 01:52:10,101 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2016-05-04 01:52:10,102 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Cannot assign container Container: [ContainerId: container_1462305309616_0002_01_000004, NodeId: quickstart.cloudera:44830, NodeHttpAddress: quickstart.cloudera:8042, Resource: <memory:1024, vCores:1>, Priority: 20, Token: Token { kind: ContainerToken, service: 127.0.0.1:44830 }, ] for a map as either container memory less than required <memory:1024, vCores:1> or no pending map tasks - maps.isEmpty=true
2016-05-04 01:52:10,102 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:3 ContRel:1 HostLocal:0 RackLocal:0
2016-05-04 01:52:11,113 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1462305309616_0002: ask=0 release= 1 newContainers=0 finishedContainers=0 resourcelimit=<memory:4096, vCores:5> knownNMs=1
2016-05-04 01:52:12,148 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_1462305309616_0002_01_000004
2016-05-04 01:52:12,148 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete event for unknown container id container_1462305309616_0002_01_000004
2016-05-04 01:52:19,026 INFO [Socket Reader #1 for port 35130] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1462305309616_0002 (auth:SIMPLE)
2016-05-04 01:52:19,123 INFO [IPC Server handler 29 on 35130] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1462305309616_0002_m_000002 asked for a task
2016-05-04 01:52:19,124 INFO [IPC Server handler 29 on 35130] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1462305309616_0002_m_000002 given task: attempt_1462305309616_0002_m_000000_0
2016-05-04 01:52:19,375 INFO [Socket Reader #1 for port 35130] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1462305309616_0002 (auth:SIMPLE)
2016-05-04 01:52:19,467 INFO [IPC Server handler 29 on 35130] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : jvm_1462305309616_0002_m_000003 asked for a task
2016-05-04 01:52:19,467 INFO [IPC Server handler 29 on 35130] org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: jvm_1462305309616_0002_m_000003 given task: attempt_1462305309616_0002_m_000001_0
2016-05-04 01:52:32,149 INFO [Socket Reader #1 for port 35130] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1462305309616_0002 (auth:SIMPLE)
2016-05-04 01:52:32,250 INFO [Socket Reader #1 for port 35130] SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for job_1462305309616_0002 (auth:SIMPLE)
2016-05-04 01:52:33,093 INFO [IPC Server handler 20 on 35130] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1462305309616_0002_m_000000_0 is : 0.0
2016-05-04 01:52:33,194 FATAL [IPC Server handler 15 on 35130] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1462305309616_0002_m_000000_0 - exited : java.lang.ClassCastException: co.nubetech.hiho.mapreduce.lib.db.apache.DBInputFormat$DBInputSplit cannot be cast to co.nubetech.hiho.mapreduce.lib.db.apache.DataDrivenDBInputFormat$DataDrivenDBInputSplit
at co.nubetech.hiho.mapreduce.lib.db.apache.DataDrivenDBRecordReader.getSelectQuery(DataDrivenDBRecordReader.java:80)
at co.nubetech.hiho.mapreduce.lib.db.DBQueryRecordReader.nextKeyValue(DBQueryRecordReader.java:99)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
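
The trace shows a plain DBInputFormat$DBInputSplit reaching DataDrivenDBRecordReader, which only understands DataDrivenDBInputFormat$DataDrivenDBInputSplit, i.e. the splits and the record reader come from mismatched input-format families. Below is a hedged sketch using the stock org.apache.hadoop.mapreduce.lib.db classes (hiho ships forked copies under co.nubetech.hiho.mapreduce.lib.db.apache, so the exact wiring in hiho may differ) in which both the splits and the reader come from DataDrivenDBInputFormat; the table, columns, and connection details are hypothetical.

// Hedged illustration with the stock Hadoop classes; the point is that the
// input format that creates the splits and the record reader that consumes
// them must come from the same family -- here, DataDrivenDB*.
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat;

public class DataDrivenJobWiring {

  // Minimal value class: one row of the hypothetical "employee" table.
  public static class MyRecord implements DBWritable, Writable {
    long id;
    String name;

    @Override public void readFields(ResultSet rs) throws SQLException {
      id = rs.getLong(1);
      name = rs.getString(2);
    }
    @Override public void write(PreparedStatement ps) throws SQLException {
      ps.setLong(1, id);
      ps.setString(2, name);
    }
    @Override public void readFields(DataInput in) throws IOException {
      id = in.readLong();
      name = in.readUTF();
    }
    @Override public void write(DataOutput out) throws IOException {
      out.writeLong(id);
      out.writeUTF(name);
    }
  }

  public static Job configure(Configuration conf) throws Exception {
    // Hypothetical connection details; the JDBC driver jar must be on the task classpath.
    DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
        "jdbc:mysql://dbhost:3306/mydb", "dbuser", "dbpass");

    Job job = Job.getInstance(conf, "mysql-to-hdfs");

    // Query-based setInput: the bounding query gives DataDrivenDBInputFormat the
    // MIN/MAX of the split column so it can build DataDrivenDBInputSplits.
    DataDrivenDBInputFormat.setInput(job, MyRecord.class,
        "SELECT id, name FROM employee WHERE $CONDITIONS",
        "SELECT MIN(id), MAX(id) FROM employee");

    // Splits and record readers now both come from DataDrivenDBInputFormat,
    // so a plain DBInputSplit never reaches DataDrivenDBRecordReader.
    job.setInputFormatClass(DataDrivenDBInputFormat.class);
    return job;
  }
}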
