Posts
Showing posts from July, 2020
Data Science: Regression: Part1
Regression is one of the most widely used supervised learning techniques in data science. It is used for sales forecasting, satisfaction analysis, price estimation, employment income prediction, etc.

The types of regression are as follows:
1. Simple Regression
   a. Simple Linear Regression
   b. Simple Non-linear Regression
2. Multiple Regression
   a. Multiple Linear Regression
   b. Multiple Non-linear Regression

Pros of using a linear regression model:
1. Very fast
2. No parameter tuning required
3. Easy to understand and highly interpretable

To implement regression we need data. Suppose our dataset resides in sample.csv. We need to construct dataframes from this sample dataset to be able to implement regression. We will use the Python libraries "pandas", "scikit-learn" and "numpy" for this purpose:

import pandas as pd
import numpy as np
from sklearn import linear_model
# creating d...
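To make concrete what fitting a simple linear regression computes, here is a minimal sketch that uses only the Python standard library instead of the pandas/scikit-learn code the post goes on to show. It applies the closed-form least-squares formulas for one feature. The inline CSV text and the column names `feature` and `target` are hypothetical stand-ins for sample.csv, whose real columns are not shown here.

```python
import csv
import io


def load_columns(csv_text, x_col, y_col):
    """Read two numeric columns from CSV text into parallel lists."""
    reader = csv.DictReader(io.StringIO(csv_text))
    xs, ys = [], []
    for row in reader:
        xs.append(float(row[x_col]))
        ys.append(float(row[y_col]))
    return xs, ys


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data standing in for sample.csv (column names are assumptions).
SAMPLE_CSV = """feature,target
1.0,120
2.0,170
3.0,220
4.0,270
"""

xs, ys = load_columns(SAMPLE_CSV, "feature", "target")
slope, intercept = fit_simple_linear_regression(xs, ys)
print(f"slope={slope:.2f} intercept={intercept:.2f}")
```

For a single feature, an ordinary least-squares fit in a library such as scikit-learn should arrive at the same slope and intercept; the hand-written version is only meant to show the arithmetic behind the fit.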
Azure Data Analytics: Part2: Benefits of using Azure Data Lake Storage Gen2
Azure has been providing great storage capabilities using Azure storage accounts and Azure Blob storage. Then why go in for Azure Data Lake Storage? Below are the key benefits of using Azure Data Lake Storage Gen2 over normal storage for your analytics needs:
1. Has tiering and data life cycle management capabilities.
2. Provides high availability, security and durability.
3. Designed to handle exabyte-scale data while providing a throughput of hundreds of gigabytes, hence can be used for both real-time and batch solutions.
4. Hadoop-compatible access: You can treat the data as if it were stored in a Hadoop Distributed File System, so it can be used directly with Azure Databricks, HDInsight and Azure Synapse Analytics without moving data between environments.
5. Security: Supports Access Control Lists (ACLs) and Portable Operating System Interface (POSIX) permissions. You can set permissions at directory level and file level. Data stored at rest is encrypted using Microsoft or customer manage...
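The Hadoop-compatible access described in point 4 is reached through the ABFS driver, which addresses data with URIs of the form `abfss://<container>@<account>.dfs.core.windows.net/<path>`. The small helper below is only a sketch of how such a URI is assembled; the function name and the account, container and path values are hypothetical.

```python
def abfss_uri(account, container, path=""):
    """Build an ABFS (secure) URI for a path in a Data Lake Storage Gen2 account."""
    if not account or not container:
        raise ValueError("account and container are required")
    # Strip any leading slash so the URI has exactly one separator after the host.
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical names, for illustration only.
uri = abfss_uri("mydatalake", "raw", "/sales/2020/07.csv")
print(uri)
```

A URI like this is what Hadoop-compatible engines would be given as the input or output location, which is why the data does not have to be copied between environments.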
Azure Data Analytics: Part1: Hosting Data Lake storage: Gen1 and Gen2
In this article we will cover hosting Azure Data Lake Storage Gen1 and Gen2.

For Gen2 Data Lake Storage:
1. Go to Resource Groups -> create new -> give a name.
2. Under the resource group -> add resources -> Storage -> Storage Account. Give the storage account name and location, then click Next to go to the Advanced tab.
3. Under Data Lake Storage Gen2, enable Hierarchical namespace.
4. Click Create.

For Gen1 Data Lake Storage:
1. Under the resource group -> add resources -> Storage -> Data Lake Storage Gen1.
2. Enter a name and click Create.
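The portal steps for Gen2 boil down to creating a general-purpose v2 storage account with the hierarchical namespace switched on. As a rough sketch of the same settings in declarative form (not part of the original walkthrough), the Python snippet below generates an ARM template fragment in which `isHnsEnabled` corresponds to the Hierarchical namespace option. The account name, location and SKU are hypothetical placeholders, and Gen1 is not covered.

```python
import json


def gen2_storage_account_template(name, location, sku="Standard_LRS"):
    """Return an ARM template (as a dict) for a storage account with hierarchical namespace."""
    resource = {
        "type": "Microsoft.Storage/storageAccounts",
        "apiVersion": "2019-06-01",
        "name": name,
        "location": location,
        "sku": {"name": sku},
        "kind": "StorageV2",
        # Equivalent of enabling "Hierarchical namespace" on the Advanced tab.
        "properties": {"isHnsEnabled": True},
    }
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [resource],
    }


# Hypothetical account name and location, for illustration only.
template = gen2_storage_account_template("mydatalake", "westeurope")
print(json.dumps(template, indent=2))
```

The printed JSON could be saved to a file and deployed into the resource group created in step 1, assuming the chosen account name is globally unique.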