Hadoop

Apache™ Hadoop® is an open-source framework that allows large data sets to be processed and distributed across commodity cluster computers.

Hi,

How can MySQL load data from and save data to Hadoop? Are there any built-in tools for it?

Is it a scalable solution?
0
Hi,

Is there any DB2 product that can get data from Hadoop and store it in a structured format?
0
Hi,

Is there any Oracle product that helps transfer data in and out of Hadoop using PL/SQL alone?

Can it also do parallel data processing for that feature?
0
Hi,

Has anyone used PolyBase on MS SQL with Hadoop? Is PolyBase's scale-out feature working fine? Is load balancing working well?
0
Hi,

Does anyone know how to integrate MariaDB and MongoDB so that they work well together?

How about MariaDB and Hadoop?
0
Hi,

Does anyone know how to integrate MS SQL and MongoDB so that they work well together?

How about MS SQL and Hadoop?
0
I have been asked to stand up a weighted search appliance for a company. The decision was to use Solr to build the search tool so they can use the associated REST API for searches and recommendations.

I'm still a beginner with Solr and have a basic architecture question. I have a table with 220 elements and 130 million records, growing by 5 million a year.

Does this become a Hadoop solution, or can it still be done with a single Solr engine? I need to know which direction to start in so I do this right.

Thanks much.
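
For a rough sense of scale, the figures in the question can be projected forward. The sketch below assumes linear growth and compares the record count with Lucene's hard per-index document limit (`IndexWriter.MAX_DOCS`), which is the ceiling for a single Solr core or shard. It says nothing about index size, memory, or query latency, which usually matter more than the raw count.

```python
# Figures from the question; the ceiling is Lucene's per-index hard limit.
CURRENT_RECORDS = 130_000_000
ANNUAL_GROWTH = 5_000_000
LUCENE_MAX_DOCS_PER_INDEX = 2_147_483_519  # IndexWriter.MAX_DOCS


def projected_records(years: int) -> int:
    """Record count after `years` of linear growth (an assumption)."""
    return CURRENT_RECORDS + ANNUAL_GROWTH * years


def years_until(limit: int) -> int:
    """Whole years of growth for which the count stays at or below `limit`."""
    return max(0, (limit - CURRENT_RECORDS) // ANNUAL_GROWTH)


if __name__ == "__main__":
    for years in (0, 5, 10, 20):
        print(f"after {years:>2} years: {projected_records(years):,} records")
    print("years before the per-index ceiling:",
          years_until(LUCENE_MAX_DOCS_PER_INDEX))
```

Under these assumptions the document count alone stays far below the per-index ceiling for decades, so the sizing decision would rest on field count, index size, and query load instead.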
0
Hello Experts,

The following Hive script reads data from the HDFS directory '/user/hive/geography' on Hadoop.

I would like to store the results on a local drive, in a directory called '/hadoop/hdfs'.

Can someone please show me how to modify the script so that it doesn't store the results of the query in '/user/hive/geography', but instead stores them in '/hadoop/hdfs' (or any local drive)?

The script is as follows:

DROP TABLE IF EXISTS HiveSampleIn; 
CREATE EXTERNAL TABLE HiveSampleIn 
(
 anonid int,
 eprofileclass int,
 fueltypes STRING,
 acorn_category int,
 acorn_group STRING,
 acorn_type int,
 nuts4 STRING,
 lacode STRING,
 nuts1 STRING,
 gspgroup STRING,
 ldz STRING,
 gas_elec STRING,
 gas_tout STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION '/user/hive/geography'; 

DROP TABLE IF EXISTS HiveSampleOut;
CREATE EXTERNAL TABLE HiveSampleOut
(
 acorn_category int,
 acorn_categorycount int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '10' STORED AS TEXTFILE LOCATION '/user/hive/geography';

INSERT OVERWRITE TABLE HiveSampleOut
SELECT
   acorn_category,
   count(*) AS acorn_categorycount
FROM HiveSampleIn GROUP BY acorn_category;

Thanks
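
To make the intended output concrete, here is a small Python sketch of the same aggregation (a count per `acorn_category`) written to a local comma-delimited file. It is not Hive code; Hive has its own `INSERT OVERWRITE LOCAL DIRECTORY` statement for writing query results to the local filesystem. The sample rows and the output file name are made up for illustration.

```python
import csv
from collections import Counter
from pathlib import Path


def count_by_category(rows):
    """Mirror of SELECT acorn_category, count(*) ... GROUP BY acorn_category."""
    return Counter(row["acorn_category"] for row in rows)


def write_counts(counts, out_path):
    """Write one 'category,count' line per group to a local file."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        for category, count in sorted(counts.items()):
            writer.writerow([category, count])
    return out_path


if __name__ == "__main__":
    # Hypothetical sample rows; only the acorn_category column matters here.
    sample = [{"acorn_category": 1}, {"acorn_category": 3}, {"acorn_category": 1}]
    result = write_counts(count_by_category(sample), "acorn_counts.csv")
    print(result.read_text())
```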
0
Techies, can someone isolate where I'm dropping the ball on getting the regex matches I'm expecting in this dataflow? My goal is to move the matched versions to one Kafka topic and the unmatched ones to another Kafka topic. Attached is the client.csv test file.

Here's what the data flow looks like, with the regex used in the ExtractText config. NiFi uses Java's version of regular expressions.

Attachments: ExtractFileProcessorNiFiwithRegex, client.csv
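
The matched/unmatched split described above can be prototyped outside NiFi before wiring up the flow. The sketch below is Python rather than Java, so some regex details differ, and the pattern is a hypothetical stand-in because the actual expression is only in the attachment. It does show one distinction that exists in both engines and often explains missing matches: whether the pattern must match the whole line (Java `matches()`, Python `fullmatch`) or only part of it (Java `find()`, Python `search`).

```python
import re

# Hypothetical pattern: three-part dotted version strings such as "1.2.3".
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def route(lines, pattern=VERSION_PATTERN, whole_line=False):
    """Split lines into (matched, unmatched), like two outgoing relationships."""
    test = pattern.fullmatch if whole_line else pattern.search
    matched, unmatched = [], []
    for line in lines:
        (matched if test(line) else unmatched).append(line)
    return matched, unmatched


if __name__ == "__main__":
    # Made-up rows standing in for client.csv.
    rows = ["client-a,1.2.3", "client-b,unknown", "4.5.6"]
    print("partial match:", route(rows))
    print("whole-line match:", route(rows, whole_line=True))
```

Running a few lines of the test file through both modes shows quickly whether the expression itself or the matching mode is responsible for the unexpected routing.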
0
Hello Community,

I have created my first HQL code (see below), but I can't get any data to appear. I recently installed the Sandbox. The installation comes with a few sample databases, and I'm using the one called sample_07 to guide me with my own .hql code.

My hql code is as follows:

CREATE EXTERNAL TABLE mysample
(
 code STRING,
 description STRING,
 total_emp INT,
 salary INT
)
ROW FORMAT DELIMITED
 FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/root/music'
TBLPROPERTIES ("skip.header.line.count" = "1");

However, when I run it in a Zeppelin Notebook with the following code, I can see the table, but no data appears:

%jdbc(hive)
select * from mysample limit 14

However, when I run the same query against the sample database called sample_07, both the table and the data appear.

I'm sure there is something very simple that I'm missing.

Can someone please let me know where I'm going wrong?
0
Hi Experts,

I'm having trouble configuring Flume to stream data from a website to my HDFS. The tutorials I've read on the Internet, such as TutorialPoint and Hadoop Pravendees, all use the same example: streaming data from Twitter to HDFS using the Twitter Apps API.

Is there any source code in PHP, Java, or ASP.NET that does this without requiring a token, as that example does? What I want is to set up an agent on the website I want to collect data from and have the data stream to my HDFS architecture.

Thanks for reading this. Best regards.
0
I'm in the Business Intelligence Department, but practically speaking we're the Reporting Department, your basic operational type of reports - lists, lists, and more lists.

I'm at an institution of higher learning, and a new project has come up for the Math Department. They want to know relationships between courses, grades, etc.

Examples:

- If someone gets a D in Calc I, what's the likelihood of graduation? This comes with various permutations, like taking Calc I again.
- What's the likelihood of someone who gets a D in Calc I getting a D or F in Calc II?
- For placing incoming students in Pre-Calc or Calc I, what are the factors that indicate success, such as Verbal SAT?

So I think I've targeted the right discipline (Analytics), but I'm not sure where to take this project.
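
The first two questions above are conditional probabilities, which can be estimated as simple frequencies once the student records are in one place. The following is a minimal sketch with a hypothetical `Student` record and made-up sample data; real work would need far more records and some care about small cohorts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Student:
    """Hypothetical record; field names are illustrative only."""
    calc1_grade: str
    calc2_grade: Optional[str]  # None if Calc II was never taken
    graduated: bool


def conditional_probability(students, given, outcome):
    """Estimate P(outcome | given) as a frequency; None if the cohort is empty."""
    cohort = [s for s in students if given(s)]
    if not cohort:
        return None
    return sum(1 for s in cohort if outcome(s)) / len(cohort)


if __name__ == "__main__":
    # Made-up sample data.
    students = [
        Student("D", "F", False),
        Student("D", "C", True),
        Student("D", None, False),
        Student("B", "B", True),
        Student("D", "D", True),
    ]
    p_grad = conditional_probability(
        students, lambda s: s.calc1_grade == "D", lambda s: s.graduated)
    p_calc2 = conditional_probability(
        students,
        lambda s: s.calc1_grade == "D" and s.calc2_grade is not None,
        lambda s: s.calc2_grade in ("D", "F"))
    print(f"P(graduate | D in Calc I) = {p_grad:.2f}")
    print(f"P(D or F in Calc II | D in Calc I, took Calc II) = {p_calc2:.2f}")
```

The placement question (which factors indicate success) goes beyond frequencies and is closer to a regression or classification problem.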
1
I've loaded several months of data to Hive using SAS. I have confirmed that everything loaded successfully and can query the data with no issues.
However, when I switch to an Impala editor (Hue/Hadoop, local here) and refresh/update the tables, I get an error when running the following query: SELECT * from data_table LIMIT 1000.

The error is:
Your query has the following error(s):
IllegalStateException: null

It seems that it can't see the table.
Any ideas?
0
Hello,

I have just started working on big data Hadoop configuration. I have a quick question about NameNode administration. Does the NameNode need to be clustered? If it goes down, there will be no access to the DataNodes.
0
