50 universal truths that will make you more successful

The following are from an excellent article by Julie Bort, published in the National Post (October 29, 2013).

  1. Have a passion for your work. If your work is meaningful to you, your work life will be a joy.
  2. If you can’t be passionate about the work itself, be passionate about the reason you do it. Maybe you don’t love your job/company/career, but the money and benefits are good for your family. Be passionate in your choice to do right by your family.
  3. If something needs changing, be the one to lead the change. If you dislike your job but are stuck, work on getting the skills that will get you unstuck. If there’s a problem at your office, work on being the one to solve it.
  4. Start small and build from there.
  5. Do the obvious stuff first, then progress to the harder stuff. (Otherwise known as going for the low-hanging fruit.)
  6. If it’s not broke, don’t fix it. Do improve it.
  7. The hardest lesson to learn is when to keep going and when to quit. No one can teach you that. At some point, you have to choose.
  8. The definition of crazy is to do the same thing the same way and expect a different result. If the result isn’t good, change something.
  9. No one succeeds alone.
  10. Ask for help. Be specific when asking. Be graceful and grateful when help comes.
  11. Surround yourself with positive people and you’ll have a positive outcome.
  12. Embrace diversity. The best way to compensate for your own weaknesses is to pick teammates who have different strengths.
  13. People experience the world differently. Two people can attend the same meeting and walk away with different impressions. Don’t fight that. Use it.
  14. You don’t have to like someone to treat that person with respect and courtesy.
  15. Don’t “should” all over someone, and don’t let someone else “should” all over you.
  16. No matter what you do or how much you achieve, there are always people who have more.
  17. There will always be people who have less, too.
  18. No matter how much you excel at things, you are not a more worthwhile human being than anyone else. No one else is more worthwhile than you, either.
  19. If you spend most of your time using your talents and doing things you are good at, you’re more likely to be happy.
  20. If you spend most of your time struggling to improve your weaknesses, you’re likely to be frustrated.
  21. Practice is the only true way to master a new skill. Be patient with yourself while you learn something new.
  22. The only way to stay fresh is to keep learning new things.
  23. To learn new things means being a beginner, and that means making mistakes.
  24. The more comfortable you grow with making beginner mistakes, the easier it is to learn new things.
  25. You will never have all the resources (time, money, people, etc.) that you want for your project or company. No one ever has all the resources they want.
  26. A lack of resources isn’t an excuse. It’s a blessing in disguise. You’ll have to get creative.
  27. Creativity and innovation are skills that can be learned and practiced by doing your usual things in a new way.
  28. Take calculated risks.
  29. In the early stages of a company, career, or project, you’ll have to say “yes” to a lot of things. In the later stages, you’ll have to say “no.”
  30. Negative feedback is necessary. Don’t automatically reject it. Examine it for the nuggets of truth, and then disregard the rest.
  31. When delivering criticism, talk about the work, not the person.
  32. Think big. Dream big. (The alternative is to think small, dream small.)
  33. Treat your dream as an ultimate roadmap. You don’t have to achieve your dream right away, but the only way to get there is to take many steps toward it.
  34. If you think big, you will hear “no” more than you hear “yes.” They don’t get to decide. You do.
  35. How long it takes you to create something is less important than how valuable and worthwhile it will be once it’s created.
  36. If there is one secret to success, it’s this: communicate your plans with other people and keep communicating those plans.
  37. Grow your network. Make an effort to meet new people and to keep in contact with those you know.
  38. No matter what technology or service you are creating/inventing at your company, it’s not about the product; it’s always about the people and the lives you will improve.
  39. No matter how successful you get, you can still fail and fail big.
  40. Failure isn’t a bad thing. It’s part of the process.
  41. Things always go wrong. The only way to keep that from hurting you is to plan for it.
  42. Learn how to respectfully, but firmly, say “no.”
  43. Say “yes” as much as you can.
  44. In order to say “yes” often, attach boundaries or a scope of work around your “yes.”
  45. No matter how rich, famous, or successful another person is, inside that person is just a human being with hopes, dreams, and fears, the same as you.
  46. Getting what you want doesn’t mean you’ll be happy. Happiness is the art of being satisfied with what you already have.
  47. Working with difficult personalities will be a part of every job. Be respectful, do your job well, and nine times out of 10 that person will move on.
  48. For that one-out-of-10 time, remember you aren’t a victim. Do what you need to get a new job.
  49. As soon as you have something to demonstrate, get an executive champion to back or support your project.
  50. Focus on what you want, not what you don’t want.

Hadoop 3: Writing a MapReduce program

MapReduce is a programming framework: you supply a specification of how data flows into and out of its two stages (map and reduce), and the framework applies that specification across large amounts of data stored across HDFS.
The MapReduce programming model itself is covered in the official tutorial; here I will only show the steps to compile the Java program and run it on Hadoop:

-- compile
mkdir wordcount_classes
javac -classpath ${HADOOP_HOME}/hadoop-core-1.2.1.jar -d wordcount_classes WordCount.java

-- create a jar from wordcount_classes
jar -cvf /home/hadoop/test/wordcount.jar -C wordcount_classes/ .

-- start hadoop dfs / mapred
start-dfs.sh
start-mapred.sh

[hadoop@localhost test]$ hadoop dfs -ls
Warning: $HADOOP_HOME is deprecated.

Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2013-09-25 12:54 /user/hadoop/test

[hadoop@localhost test]$ hadoop dfs -mkdir test/wordcount
Warning: $HADOOP_HOME is deprecated.

[hadoop@localhost test]$ hadoop dfs -mkdir test/wordcount/input
Warning: $HADOOP_HOME is deprecated.

[hadoop@localhost test]$ hadoop dfs -ls
Warning: $HADOOP_HOME is deprecated.

Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2013-09-26 12:08 /user/hadoop/test

[hadoop@localhost test]$ hadoop dfs -ls /user/hadoop/test/wordcount/input
Warning: $HADOOP_HOME is deprecated.

[hadoop@localhost test]$ hadoop dfs -copyFromLocal file01 /user/hadoop/test/wordcount/input
Warning: $HADOOP_HOME is deprecated.

[hadoop@localhost test]$ hadoop dfs -ls /user/hadoop/test/wordcount/input
Warning: $HADOOP_HOME is deprecated.

Found 1 items
-rw-r--r--   1 hadoop supergroup         22 2013-09-26 12:14 /user/hadoop/test/wordcount/input/file01
[hadoop@localhost test]$ hadoop dfs -copyFromLocal file02 /user/hadoop/test/wordcount/input
Warning: $HADOOP_HOME is deprecated.

[hadoop@localhost test]$ hadoop dfs -ls /user/hadoop/test/wordcount/input
Warning: $HADOOP_HOME is deprecated.

Found 2 items
-rw-r--r--   1 hadoop supergroup         22 2013-09-26 12:14 /user/hadoop/test/wordcount/input/file01
-rw-r--r--   1 hadoop supergroup         28 2013-09-26 12:14 /user/hadoop/test/wordcount/input/file02

[hadoop@localhost test]$ hadoop dfs -rmr /user/hadoop/test/wordcount/output
Warning: $HADOOP_HOME is deprecated.

Deleted hdfs://localhost:9000/user/hadoop/test/wordcount/output
[hadoop@localhost test]$ hadoop jar wordcount.jar WordCount  /user/hadoop/test/wordcount/input  /user/hadoop/test/wordcount/output
Warning: $HADOOP_HOME is deprecated.

13/09/26 12:19:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/09/26 12:19:30 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/09/26 12:19:30 WARN snappy.LoadSnappy: Snappy native library not loaded
13/09/26 12:19:30 INFO mapred.FileInputFormat: Total input paths to process : 2
13/09/26 12:19:30 INFO mapred.JobClient: Running job: job_201309261202_0002
13/09/26 12:19:31 INFO mapred.JobClient:  map 0% reduce 0%
13/09/26 12:19:45 INFO mapred.JobClient:  map 33% reduce 0%
13/09/26 12:19:46 INFO mapred.JobClient:  map 66% reduce 0%
13/09/26 12:19:50 INFO mapred.JobClient:  map 100% reduce 0%
13/09/26 12:19:56 INFO mapred.JobClient:  map 100% reduce 33%
13/09/26 12:19:57 INFO mapred.JobClient:  map 100% reduce 100%
13/09/26 12:19:58 INFO mapred.JobClient: Job complete: job_201309261202_0002
13/09/26 12:19:58 INFO mapred.JobClient: Counters: 30
13/09/26 12:19:58 INFO mapred.JobClient:   Job Counters
13/09/26 12:19:58 INFO mapred.JobClient:     Launched reduce tasks=1
13/09/26 12:19:58 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=27861
13/09/26 12:19:58 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/09/26 12:19:58 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/09/26 12:19:58 INFO mapred.JobClient:     Launched map tasks=3
13/09/26 12:19:58 INFO mapred.JobClient:     Data-local map tasks=3
13/09/26 12:19:58 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=12045
13/09/26 12:19:58 INFO mapred.JobClient:   File Input Format Counters
13/09/26 12:19:58 INFO mapred.JobClient:     Bytes Read=53
13/09/26 12:19:58 INFO mapred.JobClient:   File Output Format Counters
13/09/26 12:19:58 INFO mapred.JobClient:     Bytes Written=41
13/09/26 12:19:58 INFO mapred.JobClient:   FileSystemCounters
13/09/26 12:19:58 INFO mapred.JobClient:     FILE_BYTES_READ=79
13/09/26 12:19:58 INFO mapred.JobClient:     HDFS_BYTES_READ=395
13/09/26 12:19:58 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=224903
13/09/26 12:19:58 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=41
13/09/26 12:19:58 INFO mapred.JobClient:   Map-Reduce Framework
13/09/26 12:19:58 INFO mapred.JobClient:     Map output materialized bytes=91
13/09/26 12:19:58 INFO mapred.JobClient:     Map input records=2
13/09/26 12:19:58 INFO mapred.JobClient:     Reduce shuffle bytes=91
13/09/26 12:19:58 INFO mapred.JobClient:     Spilled Records=12
13/09/26 12:19:58 INFO mapred.JobClient:     Map output bytes=82
13/09/26 12:19:58 INFO mapred.JobClient:     Total committed heap usage (bytes)=445067264
13/09/26 12:19:58 INFO mapred.JobClient:     CPU time spent (ms)=1600
13/09/26 12:19:58 INFO mapred.JobClient:     Map input bytes=50
13/09/26 12:19:58 INFO mapred.JobClient:     SPLIT_RAW_BYTES=342
13/09/26 12:19:58 INFO mapred.JobClient:     Combine input records=8
13/09/26 12:19:58 INFO mapred.JobClient:     Reduce input records=6
13/09/26 12:19:58 INFO mapred.JobClient:     Reduce input groups=5
13/09/26 12:19:58 INFO mapred.JobClient:     Combine output records=6
13/09/26 12:19:58 INFO mapred.JobClient:     Physical memory (bytes) snapshot=470908928
13/09/26 12:19:58 INFO mapred.JobClient:     Reduce output records=5
13/09/26 12:19:58 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1494261760
13/09/26 12:19:58 INFO mapred.JobClient:     Map output records=8
[hadoop@localhost test]$ hadoop dfs -ls /user/hadoop/test/wordcount/output
Warning: $HADOOP_HOME is deprecated.

Found 3 items
-rw-r--r--   1 hadoop supergroup          0 2013-09-26 12:19 /user/hadoop/test/wordcount/output/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2013-09-26 12:19 /user/hadoop/test/wordcount/output/_logs
-rw-r--r--   1 hadoop supergroup         41 2013-09-26 12:19 /user/hadoop/test/wordcount/output/part-00000
[hadoop@localhost test]$ hadoop dfs -cat /user/hadoop/test/wordcount/output/part-00000
Warning: $HADOOP_HOME is deprecated.

Bye     1
Goodbye 1
Hadoop  2
Hello   2
World   2
[hadoop@localhost test]$

Hadoop 2: Configure Pseudo-distributed mode

In the last post, I showed how to get started with Apache Hadoop by installing the software and testing the pi program in the default local standalone mode. However, Hadoop is designed for writing data-intensive distributed applications, and it is meant to run in distributed mode. The following shows how to configure Hadoop to run in pseudo-distributed mode. Although this is not fully distributed, it demonstrates how Hadoop works with HDFS.

1.  Setup SSH

1) create a new SSH key pair with an empty passphrase


[hadoop@localhost ~]$ ssh-keygen

Generating public/private rsa key pair.

Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):

Created directory '/home/hadoop/.ssh'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /home/hadoop/.ssh/id_rsa.

Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.

The key fingerprint is:

88:49:13:47:88:55:60:d9:c4:1f:54:f4:6c:f2:c9:f2 hadoop@localhost.localdomain

2) copy the new public key to the list of authorized keys


[hadoop@localhost ~]$ ls .ssh

id_rsa id_rsa.pub

[hadoop@localhost ~]$ cp .ssh/id_rsa.pub .ssh/authorized_keys

3) connect to local host


[hadoop@localhost ~]$ ssh localhost

The authenticity of host 'localhost (127.0.0.1)' can't be established.

RSA key fingerprint is 07:15:bb:a2:a6:ba:60:3f:c3:31:a9:c9:4a:7c:51:6a.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'localhost' (RSA) to the list of known hosts.

Last login: Wed Sep 25 11:49:51 2013

[hadoop@localhost ~]$ exit

logout

4) Confirm that the password-less SSH is working


[hadoop@localhost ~]$ ssh localhost

Last login: Wed Sep 25 12:20:06 2013 from localhost.localdomain
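The interactive steps above can be condensed into a non-interactive script. This is a sketch assuming the default OpenSSH layout under `~/.ssh`; it appends to `authorized_keys` rather than copying over it, which is safer if other keys already exist.

```shell
# Non-interactive password-less SSH setup (sketch; assumes OpenSSH defaults).
mkdir -p ~/.ssh && chmod 700 ~/.ssh

# Generate an RSA key pair with an empty passphrase, unless one already exists.
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa -q

# Append the public key to the authorized list (cp would overwrite other keys).
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

# Verify: this should run the command without prompting for a password.
ssh -o StrictHostKeyChecking=no localhost true && echo "password-less SSH OK"
```

Note the `chmod 600` on `authorized_keys`: sshd refuses keys in a group- or world-writable file, which is a common reason password-less login silently fails.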

2. Configure Pseudo-distributed mode

1) gedit $HADOOP_HOME/conf/core-site.xml


<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/home/hadoop/datadir</value>

</property>

</configuration>

2) gedit $HADOOP_HOME/conf/hdfs-site.xml


<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

</configuration>

3) gedit $HADOOP_HOME/conf/mapred-site.xml


<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>localhost:9001</value>

</property>

</configuration>
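All three files share the same `<configuration>`/`<property>` structure. A quick way to sanity-check a value without a full XML parser is a small awk helper — a sketch that assumes the simple one-tag-per-line formatting shown above (the `get_prop` function name is mine, not a Hadoop tool):

```shell
# Extract a property value from a Hadoop site XML file.
# Sketch: assumes <name> and <value> each sit on their own line, as above.
get_prop() {  # usage: get_prop <file> <property-name>
  awk -v want="$2" '
    /<name>/  { gsub(/.*<name>|<\/name>.*/, "");   name = $0 }
    /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); if (name == want) print }
  ' "$1"
}

get_prop "$HADOOP_HOME/conf/core-site.xml"   fs.default.name
get_prop "$HADOOP_HOME/conf/mapred-site.xml" mapred.job.tracker
```

With the files above, the first call should print `hdfs://localhost:9000` and the second `localhost:9001` — handy for confirming a typo-free edit before formatting the namenode.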

4) Create the base directory for Hadoop files


[hadoop@localhost ~]$ pwd

/home/hadoop

[hadoop@localhost ~]$ mkdir datadir

[hadoop@localhost ~]$ ls -l /home/hadoop

total 37268

drwxrwxr-x 2 hadoop hadoop 4096 Sep 25 12:35 datadir

drwxr-xr-x 2 hadoop hadoop 4096 Sep 13 08:17 Desktop

drwxr-xr-x 14 hadoop hadoop 4096 Sep 13 10:23 hadoop-1.2.1

5) Format the HDFS filesystem


[hadoop@localhost ~]$ hadoop namenode -format

Warning: $HADOOP_HOME is deprecated.

13/09/25 12:43:53 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG: host = localhost.localdomain/127.0.0.1

STARTUP_MSG: args = [-format]

STARTUP_MSG: version = 1.2.1

STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013

STARTUP_MSG: java = 1.6.0_20

************************************************************/

13/09/25 12:43:53 INFO util.GSet: Computing capacity for map BlocksMap

13/09/25 12:43:53 INFO util.GSet: VM type = 32-bit

13/09/25 12:43:53 INFO util.GSet: 2.0% max memory = 1013645312

13/09/25 12:43:53 INFO util.GSet: capacity = 2^22 = 4194304 entries

13/09/25 12:43:53 INFO util.GSet: recommended=4194304, actual=4194304

13/09/25 12:43:54 INFO namenode.FSNamesystem: fsOwner=hadoop

13/09/25 12:43:54 INFO namenode.FSNamesystem: supergroup=supergroup

13/09/25 12:43:54 INFO namenode.FSNamesystem: isPermissionEnabled=true

13/09/25 12:43:54 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100

13/09/25 12:43:54 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)

13/09/25 12:43:54 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0

13/09/25 12:43:54 INFO namenode.NameNode: Caching file names occuring more than 10 times

13/09/25 12:43:56 INFO common.Storage: Image file /home/hadoop/datadir/dfs/name/current/fsimage of size 112 bytes saved in 0 seconds.

13/09/25 12:43:56 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/home/hadoop/datadir/dfs/name/current/edits

13/09/25 12:43:56 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/home/hadoop/datadir/dfs/name/current/edits

13/09/25 12:43:57 INFO common.Storage: Storage directory /home/hadoop/datadir/dfs/name has been successfully formatted.

13/09/25 12:43:57 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1

************************************************************/

6) start DFS


[hadoop@localhost ~]$ start-dfs.sh

Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-namenode-localhost.localdomain.out

localhost: Warning: $HADOOP_HOME is deprecated.

localhost:

localhost: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-localhost.localdomain.out

localhost: Warning: $HADOOP_HOME is deprecated.

localhost:

localhost: starting secondarynamenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-localhost.localdomain.out

[hadoop@localhost ~]$ jps

4346 Jps

4018 NameNode

4258 SecondaryNameNode

4136 DataNode

[hadoop@localhost ~]$ hadoop dfs -mkdir test

Warning: $HADOOP_HOME is deprecated.

[hadoop@localhost ~]$ hadoop dfs -ls /

Warning: $HADOOP_HOME is deprecated.

Found 1 items

drwxr-xr-x - hadoop supergroup 0 2013-09-25 12:54 /user

[hadoop@localhost ~]$ hadoop dfs -ls /user/hadoop

Warning: $HADOOP_HOME is deprecated.

Found 1 items

drwxr-xr-x - hadoop supergroup 0 2013-09-25 12:54 /user/hadoop/test

7) start MAPRED


[hadoop@localhost ~]$ start-mapred.sh

Warning: $HADOOP_HOME is deprecated.

starting jobtracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-localhost.localdomain.out

localhost: Warning: $HADOOP_HOME is deprecated.

localhost:

localhost: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-localhost.localdomain.out

[hadoop@localhost ~]$ jps

4887 JobTracker

5102 Jps

4018 NameNode

4258 SecondaryNameNode

4136 DataNode

5010 TaskTracker

8) stop ALL


[hadoop@localhost ~]$ stop-all.sh

Warning: $HADOOP_HOME is deprecated.

stopping jobtracker

localhost: Warning: $HADOOP_HOME is deprecated.

localhost:

localhost: stopping tasktracker

stopping namenode

localhost: stopping datanode

localhost: Warning: $HADOOP_HOME is deprecated.

localhost:

localhost: stopping secondarynamenode

localhost: Warning: $HADOOP_HOME is deprecated.

localhost:

9) start ALL


[hadoop@localhost ~]$ start-all.sh

Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-namenode-localhost.localdomain.out

localhost: Warning: $HADOOP_HOME is deprecated.

localhost:

localhost: starting datanode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-datanode-localhost.localdomain.out

localhost: Warning: $HADOOP_HOME is deprecated.

localhost:

localhost: starting secondarynamenode, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-secondarynamenode-localhost.localdomain.out

starting jobtracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-jobtracker-localhost.localdomain.out

localhost: Warning: $HADOOP_HOME is deprecated.

localhost:

localhost: starting tasktracker, logging to /home/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-hadoop-tasktracker-localhost.localdomain.out

[hadoop@localhost ~]$ jps

5883 JobTracker

5530 NameNode

6028 TaskTracker

5654 DataNode

6123 Jps

5801 SecondaryNameNode

10) Web-based admin interfaces

Web-based interface for the NameNode:
http://localhost:50070
Web-based interface for the JobTracker:
http://localhost:50030
Web-based interface for the TaskTracker:
http://localhost:50060