Hadoop

Apache™ Hadoop® is an open-source framework that allows large data sets to be processed and stored in a distributed fashion across clusters of commodity computers.


hi,

how can MySQL work with Hadoop, i.e. load data from and save data to it? Are there any built-in tools for this?

Is it a scalable solution?
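For what it's worth, the two common routes are Apache Sqoop, which bulk-transfers data between MySQL and HDFS/Hive as parallel map tasks (so it scales with the cluster), and, on Hive 3.x, the JDBC storage handler, which exposes a MySQL table directly to Hive queries. A minimal sketch of the latter; the connection details and the orders table are made-up examples, and the property names can vary by Hive version:

-- Sketch only: map a MySQL table into Hive via the JDBC storage handler (Hive 3.x).
CREATE EXTERNAL TABLE mysql_orders (
 order_id INT,
 customer_id INT,
 amount DOUBLE
)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
 "hive.sql.database.type" = "MYSQL",
 "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver",
 "hive.sql.jdbc.url" = "jdbc:mysql://dbhost:3306/sales",
 "hive.sql.dbcp.username" = "etl_user",
 "hive.sql.dbcp.password" = "etl_password",
 "hive.sql.table" = "orders"
);

-- Copy into a native Hive table when distributed processing is needed:
-- CREATE TABLE orders_hdfs STORED AS ORC AS SELECT * FROM mysql_orders;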

hi,

is there any Hadoop-to-DB2 gateway/proxy solution for DB2 that can scale out?
hi,

is there any DB2 product that can get data from Hadoop and store it in a structured format?
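One candidate, offered as a sketch rather than a recommendation: IBM Db2 Big SQL runs SQL over data that stays in Hadoop and scales out across worker nodes, which also speaks to the gateway question above. Assuming Big SQL is installed, a Hadoop-backed table is declared roughly like this (the columns and location are placeholders, and the exact syntax varies by release):

-- Sketch: a Big SQL table whose data lives in HDFS.
CREATE HADOOP TABLE sales_from_hadoop (
 sale_id INT,
 region VARCHAR(32),
 amount DECIMAL(10,2)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/sales';

-- Ordinary DB2 SQL can then query the data or copy it into structured DB2 tables.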
hi,

is there any Oracle product that helps transfer data into and out of Hadoop using just PL/SQL?

And can that feature also do parallel data processing?
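Oracle's answer to this is the external-table access drivers that ship with Oracle Big Data SQL (and the older Oracle SQL Connector for HDFS): the Hadoop/Hive data appears as an external table, plain SQL and PL/SQL read it, and the PARALLEL clause gives parallel scans. A hedged sketch, assuming Big Data SQL is configured and a Hive table of the same name exists in the default Hive database; the names and directory are placeholders:

-- Sketch: an Oracle external table backed by a Hive table via the ORACLE_HIVE driver.
CREATE TABLE sales_hdfs (
 sale_id NUMBER,
 region VARCHAR2(32),
 amount NUMBER
)
ORGANIZATION EXTERNAL (
 TYPE ORACLE_HIVE
 DEFAULT DIRECTORY default_dir
)
PARALLEL;

-- PL/SQL then reads it like any other table, e.g.
-- FOR r IN (SELECT region, SUM(amount) amt FROM sales_hdfs GROUP BY region) LOOP ... END LOOP;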
hi,

has anyone used PolyBase on MS SQL Server with Hadoop? Does PolyBase's scale-out feature work well? Does the load balancing work well?
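I can't vouch for how evenly a PolyBase scale-out group balances load, but for reference, the basic setup on SQL Server 2016+ against a Hadoop cluster looks like the following; the data source, file format, table, and paths are placeholder examples:

-- Sketch: PolyBase external table over delimited files in HDFS (SQL Server 2016+).
CREATE EXTERNAL DATA SOURCE HadoopCluster
WITH (TYPE = HADOOP, LOCATION = 'hdfs://namenode:8020');

CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ','));

CREATE EXTERNAL TABLE dbo.HdfsClients (
    ClientId INT,
    ClientName NVARCHAR(100)
)
WITH (LOCATION = '/data/clients/',
      DATA_SOURCE = HadoopCluster,
      FILE_FORMAT = CsvFormat);

-- T-SQL queries against dbo.HdfsClients now read the HDFS files through PolyBase.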
hi,

does anyone know how to integrate MariaDB and MongoDB so that they work together well?

How about MariaDB and Hadoop?
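Not a definitive answer, but the usual bridge on the MariaDB side is the CONNECT storage engine: it has a MONGO table type (MariaDB 10.2+) for MongoDB collections and JDBC/ODBC table types that can reach Hive on Hadoop. A rough sketch with placeholder hosts and names; the available options depend on how CONNECT was built:

-- Sketch: expose a MongoDB collection as a MariaDB table through the CONNECT engine.
CREATE TABLE restaurants_mongo (
 name VARCHAR(64),
 cuisine VARCHAR(64)
)
ENGINE=CONNECT TABLE_TYPE=MONGO
TABNAME='restaurants' DBNAME='test'
CONNECTION='mongodb://mongohost:27017';

-- For Hadoop, the same engine's JDBC table type can point at HiveServer2 instead,
-- e.g. TABLE_TYPE=JDBC with CONNECTION='jdbc:hive2://hivehost:10000/default'.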
hi,

does anyone know how to integrate MS SQL and MongoDB so that they work together well?

How about MS SQL and Hadoop?
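On the MS SQL side both halves of the question point at PolyBase: SQL Server 2016+ covers Hadoop through external tables (the sketch under the PolyBase question above shows the Hadoop side), and SQL Server 2019 added MongoDB as an external data source. A hedged sketch of the MongoDB side; the credential, host, and collection are placeholders, so check the exact syntax for your version:

-- Sketch: SQL Server 2019 PolyBase external table over a MongoDB collection.
CREATE DATABASE SCOPED CREDENTIAL MongoCredential
WITH IDENTITY = 'mongo_user', SECRET = 'mongo_password';

CREATE EXTERNAL DATA SOURCE MongoDb
WITH (LOCATION = 'mongodb://mongohost:27017', CREDENTIAL = MongoCredential);

CREATE EXTERNAL TABLE dbo.MongoOrders (
    _id NVARCHAR(24),
    customer NVARCHAR(100),
    amount FLOAT
)
WITH (LOCATION = 'salesdb.orders', DATA_SOURCE = MongoDb);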
I have been asked to stand up a weighted search appliance for a company. The decision was to use SOLR to create the search tool so they can use the associated REST API for searches and recommendations.

I'm still a beginner with SOLR and have to ask a basic architecture question. I have a table with 220 fields and 130 million records, growing by 5 million a year.

Does this become a Hadoop solution, or can this still be done with a single SOLR engine? I need to know which direction to start with so I do this right.

Thanks much.
Hello Experts,

The following Hive script retrieves data from HDFS on Hadoop, from the directory '/user/hive/geography'.

I would like to store the results on a local drive, under '/hadoop/hdfs'.

Can someone please show me how to modify the script so that it doesn't store the results of the query in '/user/hive/geography', but instead stores them in '/hadoop/hdfs' (or any local directory)?

The script is as follows:

DROP TABLE IF EXISTS HiveSampleIn; 
CREATE EXTERNAL TABLE HiveSampleIn 
(
 anonid int,
 eprofileclass int,
 fueltypes STRING,
 acorn_category int,
 acorn_group STRING,
 acorn_type int,
 nuts4 STRING,
 lacode STRING,
 nuts1 STRING,
 gspgroup STRING,
 ldz STRING,
 gas_elec STRING,
 gas_tout STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION '/user/hive/geography'; 

DROP TABLE IF EXISTS HiveSampleOut; 
CREATE EXTERNAL TABLE HiveSampleOut 
(
 acorn_category int,
 acorn_categorycount int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION '/user/hive/geography';


INSERT OVERWRITE TABLE HiveSampleOut
SELECT 
   acorn_category,
   count(*) AS acorn_categorycount 
FROM HiveSampleIn GROUP BY acorn_category;



Thanks
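One way to do it, as a sketch: instead of writing into the HiveSampleOut external table, Hive can write the query results straight to a directory with INSERT OVERWRITE LOCAL DIRECTORY. The LOCAL keyword sends the output to the local filesystem of the machine running the job rather than to HDFS, and the directory contents are overwritten each run:

INSERT OVERWRITE LOCAL DIRECTORY '/hadoop/hdfs'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT
   acorn_category,
   count(*) AS acorn_categorycount
FROM HiveSampleIn
GROUP BY acorn_category;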
Techies, can someone isolate where I'm dropping the ball on getting the regex matches I'm expecting in this dataflow? My goal is to move the matched versions over to one Kafka topic and the unmatched over to another Kafka topic. Attached is the client.csv test file.


Here's what the data flow looks like, with the regex used in the ExtractText config. NiFi uses Java's version of regular expressions.


Hello Community,

I have created my first HQL code (see below) and I can't get any data to appear. I have recently installed the Sandbox. The installation comes with a few sample databases. I'm using the database called sample_07 to guide me with my own .hql code.

My hql code is as follows:

CREATE EXTERNAL TABLE mysample
(
 code STRING,
 description STRING,
 total_emp INT,
 salary INT
)
ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/root/music'
TBLPROPERTIES ("skip.header.line.count" = "1");



However, when I run the code using a Zeppelin notebook with the following query, I can see the tables, but no data appears:

%jdbc(hive)
select * from mysample limit 14



However, when I run the same code using the sample database called sample_07, both the tables and the data appear.


I'm sure there is something very simple that I'm missing.

Can someone please let me know where I'm going wrong?
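A couple of guesses, since nothing in the DDL itself looks wrong: an external table only shows rows when data files actually sit in the LOCATION directory in HDFS, so if /root/music exists only on the local Linux filesystem (or is empty) the table will come back empty, and skip.header.line.count merely hides a header line, it does not load anything. Something along these lines should confirm it; the file path is a made-up example of a CSV already copied into HDFS:

-- Where does Hive think the table lives?
DESCRIBE FORMATTED mysample;

-- Move a data file from an HDFS staging path into the table's location:
LOAD DATA INPATH '/tmp/music/tracks.csv' INTO TABLE mysample;

SELECT * FROM mysample LIMIT 14;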
Hi Experts.

I'm having trouble configuring Flume to stream data from a website to my HDFS. The tutorials I've read on the Internet, such as TutorialsPoint and Hadoop Pravendees, all have the same example, which streams data from Twitter to HDFS using the Twitter Apps API.

Is there any source code in PHP, Java, or ASP.NET to do this without getting a token like in that example? What I want to do is set up an agent on the website I want to get data from and have the data stream to my HDFS architecture.

Thanks for reading this, best regards.
I'm in the Business Intelligence Department, but practically speaking we're the Reporting Department, your basic operational type of reports - lists, lists, and more lists.

I'm at an institution of higher learning, and a new project has come up for the Math Department. They want to know relationships between courses, grades, etc.

Examples:

- if someone gets a D in Calc I, what's the likelihood of graduation? With various permutations, like taking Calc I again
- what's the likelihood of someone who gets a D in Calc I getting a D or F in Calc II?
- for placing incoming students in Pre-Calc or Calc I, what are the factors that indicate success, such as Verbal SAT?

So I think I've targeted the right discipline (Analytics), but I'm not sure where to take this project.
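Before reaching for Hadoop or a dedicated analytics stack, it is worth seeing how far plain SQL goes, since these conditional likelihoods are just filtered aggregations. A sketch for the Calc I / Calc II bullet against a hypothetical enrollments(student_id, course, grade) table:

-- Among students who got a D in Calc I and went on to take Calc II,
-- what share got a D or F in Calc II?
SELECT
   COUNT(CASE WHEN c2.grade IN ('D', 'F') THEN 1 END) * 1.0 / COUNT(*) AS likelihood
FROM enrollments c1
JOIN enrollments c2
  ON c2.student_id = c1.student_id
 AND c2.course = 'CALC II'
WHERE c1.course = 'CALC I'
  AND c1.grade = 'D';

The placement question (which factors predict success) is where this stops being a reporting query and becomes a regression or classification problem, which is the genuinely "Analytics" part of the project.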
I've loaded several months of data to Hive using SAS. I have confirmed that everything loaded successfully and can query the data with no issues.
However, when I move to using an Impala editor (locally, in Hue/Hadoop) and refresh/update the tables, I get an error when running the following query: SELECT * from data_table LIMIT 1000.

The error is:
Your query has the following error(s):
IllegalStateException: null

It seems that it can't see the table.
Any ideas?
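One common cause, offered as a guess: tables created or loaded through Hive (which is what SAS does here) are not visible to Impala until its metadata cache is told about them, and the refresh action in Hue does not always cover a brand-new table. Running this in the Impala editor usually clears it up:

-- Make Impala pick up a table created outside of Impala (e.g. via Hive/SAS):
INVALIDATE METADATA data_table;

-- If Impala already knew the table and only new files were added, a lighter refresh suffices:
REFRESH data_table;

SELECT * FROM data_table LIMIT 1000;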
Hello,

I have just started to work on big data Hadoop configuration. I have a quick question regarding the NameNode: does the NameNode need to be clustered? Because if it goes down, there won't be any access to the DataNodes.
