Apache Spark

Spark Metrics in CDAP
http://docs.cask.co/cdap/3.6.0/en/developers-manual/building-blocks/spark-programs.html

Notes:
Configuration: http://spark.apache.org/docs/latest/configuration.html
Monitoring: http://spark.apache.org/docs/latest/monitoring.html
Streaming programming guide: http://spark.apache.org/docs/latest/streaming-programming-guide.html
Java API: http://spark.apache.org/docs/latest/api/java/index.html

Read CSV file from remote system:
http://stackoverflow.com/questions/34479895/read-csv-file-in-apache-spark-from-remote-location-ftp

https://github.com/databricks/learning-spark/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/BasicLoadTextFromFTP.scala

http://octuplus.co/Detalles/27862/Can’t-read-files-via-FTP-using-SparkContext-textFile(—)-on-Google-Dataproc

To test with FTP, set up a test FTP server with Apache FtpServer:
https://mina.apache.org/ftpserver-project/managing_users.html
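
The links above come down to handing Spark an ftp:// URL with the credentials embedded. Here is a minimal PySpark sketch of that idea. The helper names (build_ftp_url, read_lines_over_ftp) and the user, password, host, port and path values are hypothetical, chosen only for illustration. It also assumes the FTP filesystem support on the cluster accepts percent-encoded credentials, which may depend on the Hadoop version in use.

```python
from urllib.parse import quote


def build_ftp_url(user, password, host, path, port=21):
    """Build an ftp:// URL with percent-encoded credentials.

    Hypothetical helper: encoding keeps characters such as '@' or ':'
    in the password from breaking the URL structure.
    """
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"ftp://{credentials}@{host}:{port}/{path.lstrip('/')}"


def read_lines_over_ftp(spark_context, url):
    """Return an RDD with one element per line of the remote file.

    Assumption: the cluster can reach the FTP server and resolves the
    ftp:// scheme; one of the links above reports that this can fail
    in some environments.
    """
    return spark_context.textFile(url)


if __name__ == "__main__":
    # Imported here so the URL helper can be used without PySpark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="ReadCsvOverFtp")
    # Placeholder credentials and location for a local test FTP server.
    url = build_ftp_url("admin", "admin", "localhost", "data/test.csv", port=2121)
    lines = read_lines_over_ftp(sc, url)
    # Naive CSV split; quoted fields containing commas need a real parser.
    rows = lines.map(lambda line: line.split(","))
    print(rows.take(5))
    sc.stop()
```

If textFile cannot read from FTP in a given environment, the Stack Overflow thread above discusses alternatives such as fetching the file first and reading the local copy.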
