mapreduce.DefaultVisibilityExpressionResolver: Error scanning 'labels' table #4

Open
davidonlaptop opened this issue Aug 22, 2015 · 26 comments

Comments

@davidonlaptop
Member

HBase complains about a missing table when importing data using ImportTsv.

2015-08-22 08:46:35,470 ERROR [LocalJobRunner Map Task Executor #0] mapreduce.DefaultVisibilityExpressionResolver: Error scanning 'labels' table
org.apache.hadoop.hbase.TableNotFoundException: Table 'hbase:labels' was not found, got: abcd2.
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1274)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1155)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:811)
    at org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver.init(DefaultVisibilityExpressionResolver.java:90)
    at org.apache.hadoop.hbase.mapreduce.CellCreator.<init>(CellCreator.java:48)
    at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.setup(TsvImporterMapper.java:107)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
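Note on the 'labels' table (context, not part of the original report): the 'hbase:labels' system table only exists when HBase's visibility-labels feature is enabled, which is done by registering the VisibilityController coprocessor. A sketch of the hbase-site.xml entries per the HBase reference guide (verify the property names against your HBase version):

```xml
<!-- Enable cell-level visibility labels; with this in place HBase
     creates the hbase:labels system table on master startup. -->
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.visibility.VisibilityController</value>
</property>
```

If visibility labels are not used, the scan of 'hbase:labels' simply fails and ImportTsv continues without them.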
@Mahesha999

Did you get a workaround? I was also getting the same error.

@startprogress

How to fix this?

@davidonlaptop
Member Author

Damn, can't remember right now... !

Apparently, you're not the only one:
https://issues.apache.org/jira/browse/HBASE-14365

@jean-marc, have you seen this before ?


@davidonlaptop
Member Author

Do you have the full stack trace?


@startprogress

ya, as follows:

2016-10-07 11:34:53,296 INFO [main-SendThread(aiszk1.boloomo.com:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server aiszk1.boloomo.com/192.168.30.113:2181. Will not attempt to authenticate using SASL (unknown error)
2016-10-07 11:34:53,497 INFO [main-SendThread(aiszk1.boloomo.com:2181)] org.apache.zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.30.121:50440, server: aiszk1.boloomo.com/192.168.30.113:2181
2016-10-07 11:34:53,708 INFO [main-SendThread(aiszk1.boloomo.com:2181)] org.apache.zookeeper.ClientCnxn: Session establishment complete on server aiszk1.boloomo.com/192.168.30.113:2181, sessionid = 0x25789e819482541, negotiated timeout = 60000
2016-10-07 11:35:05,294 ERROR [main] org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver: Error scanning 'labels' table
org.apache.hadoop.hbase.TableNotFoundException: hbase:labels
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1404)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1199)
        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
        at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:867)
        at org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver.init(DefaultVisibilityExpressionResolver.java:91)
        at org.apache.hadoop.hbase.mapreduce.CellCreator.<init>(CellCreator.java:48)
        at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.setup(TsvImporterMapper.java:108)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

@davidonlaptop
Member Author

Trying again to tag Jean-Marc into discussion: @jmspaggi

@davidonlaptop
Member Author

Hi @startprogress, does it work in spite of the error message?

If not, could you post a step-by-step to reproduce your problem?

@jean-marc

I am afraid you have the wrong Jean-Marc

jm


@jmspaggi

jmspaggi commented Oct 8, 2016

Worked ;)

Is this stacktrace causing any issue? Which HBase version are you using? As David pointed out before on the JIRA, if cell-level security is not used, this "ERROR" can be considered an INFO or a WARNING...

Is HBase still working well after that?

JMS

@davidonlaptop
Member Author

I've run a simple test which throws the error, but works anyhow:

Importing the CSV:

cat <<EOF >> /tmp/simple.csv
a,b,c
1,2,4
5,6,8
EOF

echo "create 'simpletable', 'cf'" | hbase shell

hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,cf:b,cf:c '-Dimporttsv.separator=,' simpletable /tmp/simple.csv

Validating that the data is there:

root@hbase-shell:/# hbase shell
2016-10-08 18:41:34,140 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.1, rd0a115a7267f54e01c72c603ec53e91ec418292f, Tue Jun 23 14:44:07 PDT 2015

hbase(main):001:0> scan 'simpletable'
ROW                                                   COLUMN+CELL
 1                                                    column=cf:b, timestamp=1475951843015, value=2
 1                                                    column=cf:c, timestamp=1475951843015, value=4
 5                                                    column=cf:b, timestamp=1475951843015, value=6
 5                                                    column=cf:c, timestamp=1475951843015, value=8
 a                                                    column=cf:b, timestamp=1475951843015, value=b
 a                                                    column=cf:c, timestamp=1475951843015, value=c
3 row(s) in 0.1780 seconds

@startprogress: could you confirm on your side?

@startprogress

@davidonlaptop
I've confirmed. This test did throw the error, but worked anyhow. Here is the output of scan:

hbase(main):001:0> scan 'simpletable'
ROW                                      COLUMN+CELL                                                                                                        
 1                                       column=cf:b, timestamp=1475978024254, value=2                                                                      
 1                                       column=cf:c, timestamp=1475978024254, value=3                                                                      
 4                                       column=cf:b, timestamp=1475978024254, value=5                                                                      
 4                                       column=cf:c, timestamp=1475978024254, value=6                                                                      
 a                                       column=cf:b, timestamp=1475978024254, value=b                                                                      
 a                                       column=cf:c, timestamp=1475978024254, value=c                                                                      
3 row(s) in 0.5620 seconds

But with my own data (about 140 GB), after running ImportTsv the HBase table is still empty. So I think some other errors might exist that were not logged, which makes them hard to find.

@jmspaggi

jmspaggi commented Oct 9, 2016

Nothing else in the logs? What does your csv file look like?


@startprogress

@jmspaggi
Grepping the logs for 'Exception', I found three types of entries:

org.apache.hadoop.hbase.TableNotFoundException: hbase:labels
2016-10-07 11:39:30,234 WARN [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: Error reading the stream java.io.IOException: No such process
java.net.SocketTimeoutException: Read timed out

My file is a TSV file; each field is separated by a tab.

@startprogress

startprogress commented Oct 9, 2016

@jmspaggi
I got a WARN as follows:

2016-10-07 13:06:34,383 WARN [fetcher#8] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to 192.168.30.122:13562 with 6 map outputs
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:170)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:430)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.setupConnectionsWithRetry(Fetcher.java:395)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.openShuffleUrl(Fetcher.java:266)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:323)
        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)

Maybe this is the real problem.

@jmspaggi

jmspaggi commented Oct 9, 2016

This is "just" a timeout, so MR should retry. Is your table completely empty? Nothing? And no other issue? Can you change that to DEBUG mode and retry to see what it says?

JMS


@davidonlaptop
Member Author

Also, you could try with a subset of your data first, like the first 10 lines.

@startprogress

@davidonlaptop
I've tried. A subset of my data worked fine.

@startprogress

@jmspaggi
At the beginning, I used a table in a namespace, named 'n1:t1'. After I changed back to the default namespace, it worked well and the table has data. I don't know why, but it's a kind of workaround.
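(A sketch for reference, not the exact commands used in this thread; 'n1', 't1', and the file path are placeholders.) The namespace has to exist before a table can be created in it:

```shell
# Create the namespace, then a table inside it
# (piping runs the hbase shell non-interactively).
echo "create_namespace 'n1'" | hbase shell
echo "create 'n1:t1', 'f'" | hbase shell

# ImportTsv should then accept the fully qualified table name:
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
    -Dimporttsv.columns=HBASE_ROW_KEY,f:v \
    'n1:t1' /path/to/data.tsv
```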

@jmspaggi

Oh, interesting! So you are saying that ImportTsv with a namespace does not seem to work? Do you have the exact command line you used for that? Have you tried putting the namespace:table within quotes?

JMS


@startprogress

startprogress commented Oct 10, 2016

@jmspaggi
Yes, it's really strange. The exact command line:

        hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
            -Dimporttsv.columns=HBASE_ROW_KEY,f:v \
            -Dimporttsv.bulk.output=$tempdir \
            -Dmapred.min.split.size=$minSplit \
            -Ddfs.umaskmode=000 \
            -Dmapreduce.map.memory.mb=$trueMapMemmb \
            -Dmapreduce.map.java.opts.max.heap=$truemapjava \
            -Dmapreduce.reduce.memory.mb=$reduceMem \
            -Dmapreduce.reduce.java.opts.max.heap=$reduceJava \
            -Dmapreduce.reduce.cpu.vcores=$reduceVcores \
            tableName hdfspath

Those parameters are calculated from the size of the file and the configuration of my cluster.
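One observation about the command above (a hypothesis, not confirmed as the cause here): because it passes -Dimporttsv.bulk.output, ImportTsv writes HFiles into $tempdir instead of inserting rows into the table directly. The target table stays empty until those HFiles are loaded in a separate bulk-load step, roughly:

```shell
# After the ImportTsv job finishes, load the generated HFiles
# into the table (the "completebulkload" step):
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles $tempdir tableName
```

If that step never ran, an empty table after the job would be the expected result rather than a hidden error.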

@davidonlaptop
Member Author

davidonlaptop commented Oct 11, 2016

@startprogress : FYI. I just pushed a new Dockerfile with all the latest versions (HBase 1.2.3, OpenJDK 8, Ubuntu 16.04) if you want to give it a try. Maybe, it can solve your issue.

@jmspaggi

@startprogress what is the value of tableName? Any chance to try with "tableName" instead?


@startprogress

@jmspaggi With tableName changed to 't1' instead of 'n1:t1', the command worked well.

@startprogress

startprogress commented Oct 12, 2016

@davidonlaptop OK, thanks for your help anyway. I will have a try sometime later.

@jmspaggi

@startprogress What I'm wondering is: can you give "n1:t1" instead of n1:t1 (note the double quotes)? What I'm suspecting is that the shell interprets the ":".
@david does it work for you with another namespace? Make sure it is created before using it.


@startprogress

@jmspaggi
Oh, I got it. I tried with the double quotes; that didn't work either.
