Posts

Migrating Data within Azure

At times you need to move data within Azure from one storage account to another for a variety of reasons, say reorganization, billing constraints, data usability, or data governance. Moving every container in a storage account, along with all the blobs inside each container, can be a challenging task. Well, "azcopy" comes to the rescue. It lets you selectively copy only certain blobs inside your containers, copy specific containers from one storage account to another, or copy the entire contents of a storage account, including all its containers and blobs, to another account. The storage accounts can be in the same region or in different regions. Below is an example highlighting the syntax to copy all the contents of one storage account, including containers and blobs, to another storage account: azcopy copy 'https://<source-storage-account-name>.blob...
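As a sketch of the full command (the account names are placeholders and the SAS tokens are assumptions — substitute your own; this requires azcopy installed and valid tokens, so it is shown here as a template rather than something runnable as-is):

```shell
# Hypothetical source and destination accounts -- replace with your own.
# The '?<SAS-token>' part authorizes azcopy; --recursive walks every
# container and blob in the source account.
azcopy copy \
  'https://<source-storage-account-name>.blob.core.windows.net/?<source-SAS-token>' \
  'https://<dest-storage-account-name>.blob.core.windows.net/?<dest-SAS-token>' \
  --recursive
```

SAS tokens can be generated from the "Shared access signature" blade of each storage account in the Azure portal.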

Python 3: unpickling error

Working with pickle files in Python can be tricky. At times, when loading a pickle file using the load method of the pickle module, you might get an error like the one below: UnpicklingError: invalid load key, '\xef'. Please note this error is caused by corruption in the data of the pickle file, which might have been introduced during a copy operation of the pickle file or for other reasons. You can attempt to re-copy the pickle file and save it again to make this error go away.
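One way to confirm whether a copy operation corrupted the file, before blaming the pickle itself, is to compare checksums of the original and the copy. A minimal sketch, with a throwaway file standing in for a real pickle (the file names are hypothetical):

```shell
# Create a stand-in for a pickle file, copy it, and compare checksums.
# Matching hashes mean the copy is byte-identical to the original;
# a mismatch would explain the UnpicklingError.
printf 'example-bytes' > model.pkl
cp model.pkl model_copy.pkl
sha256sum model.pkl model_copy.pkl
```

Run the same `sha256sum` on the source machine and the destination machine; if the hashes differ, re-copy the file.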

Use of /bin/bash in shell scripts

Whenever we write a shell script, we come across /bin/bash at the beginning of the script. In this article we explore what it is and why it is used. A Unix system typically offers more than one shell, for example the default system shell (sh) and the Bash shell. If you put #!/bin/sh at the top of the script, it runs in the default system shell; if you put #!/bin/bash, it runs in the Bash shell. This line looks like a comment, since its first character is #, which is otherwise used for commenting, but the kernel treats it specially as an interpreter directive (commonly called the shebang).
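As a quick illustration, here is a script whose first line names Bash as the interpreter; the `[[ ]]` test used below is a Bash feature that a minimal system shell may not support:

```shell
#!/bin/bash
# The shebang above tells the kernel to run this script with Bash.
name="world"
# [[ ]] is a Bash conditional; -n tests for a non-empty string.
if [[ -n "$name" ]]; then
    echo "hello, $name"   # prints: hello, world
fi
```

Save it as, say, demo.sh, then `chmod +x demo.sh && ./demo.sh` to run it via the shebang.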

Getting Data from AWS data lake into Azure Data Lake

With organizations becoming more cloud agnostic, time and again a need arises to migrate your data from one cloud platform to another, for example to make use of the ETL and machine learning platform of another cloud vendor. In this article we cover how to migrate your data from an AWS S3 data lake to an Azure Data Lake storage account. We would be using azcopy for the purpose. Step 1: Authenticate to the AWS account where the data lake resides. For this purpose we use the user's security access key ID and secret access key: set AWS_ACCESS_KEY_ID=<value> set AWS_SECRET_ACCESS_KEY=<value> Step 2: Authenticate to the Azure Data Lake storage account. For this purpose we use a security access (SAS) token generated from the storage account where the data lake resides, and embed it in the URL of the container where we want to put our data. Step 3: Initiate the data transfer via azcopy: azcopy copy "https://azuremstnks.s3.ap-south-1.amazonaws.com/weather/" "https...
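The three steps above can be sketched as one script. The bucket URL is the one from the post; the storage account, container, and SAS token are placeholders, and the whole thing requires azcopy plus valid credentials, so treat it as a template:

```shell
# Step 1: credentials for the source S3 bucket.
# (The post uses Windows 'set'; on Linux/macOS use 'export' as here.)
export AWS_ACCESS_KEY_ID=<value>
export AWS_SECRET_ACCESS_KEY=<value>

# Steps 2 and 3: SAS token embedded in the destination container URL,
# then a recursive copy of the S3 prefix into Azure.
azcopy copy \
  "https://azuremstnks.s3.ap-south-1.amazonaws.com/weather/" \
  "https://<storage-account>.blob.core.windows.net/<container>/?<SAS-token>" \
  --recursive
```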

Stateful Sets and Config Maps

ConfigMap is an API object that allows you to store non-confidential data as key-value pairs. It allows you to decouple environment-specific information from your container images, making them more portable. Pods can consume ConfigMaps as environment variables, command-line arguments, or configuration files in volumes. A StatefulSet is a workload API object used to manage stateful applications.
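As a minimal sketch of the ConfigMap side (the map name and keys are hypothetical, and kubectl is required), `--dry-run=client` renders the manifest locally without touching a cluster:

```shell
# Build a ConfigMap from literal key-value pairs and print its YAML
# instead of creating it on the cluster.
kubectl create configmap app-config \
  --from-literal=LOG_LEVEL=debug \
  --from-literal=DB_HOST=db.internal \
  --dry-run=client -o yaml
```

A Pod could then consume `app-config` either through `envFrom`/`configMapKeyRef` in its container spec or by mounting it as a volume.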

Difference between Docker CMD and ENTRYPOINT instructions, and RUN, COPY and ADD

Docker CMD helps define the default command and/or parameters; the parameters of a CMD instruction can be overridden at run time. If a Dockerfile has several CMD instructions, only the last one takes effect. Docker ENTRYPOINT helps define the container as an executable. RUN is an instruction that executes at image build time, whereas CMD runs when the container starts. The COPY instruction copies files from the local build context into a directory in the container. ADD, apart from the COPY functionality, also allows copying from a URL and automatically untars a local tar file from the source into the destination in the container. Shell form -> Instruction <command> Exec form -> Instruction ["executable", "param1", "param2"]
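A small sketch of the CMD/ENTRYPOINT interaction (the image tag is hypothetical and Docker must be installed, so this is a template rather than something the page runs for you):

```shell
# Write a Dockerfile inline: ENTRYPOINT fixes the executable (exec form),
# CMD supplies default parameters that 'docker run' arguments replace.
cat > Dockerfile <<'EOF'
FROM alpine
ENTRYPOINT ["echo"]
CMD ["default message"]
EOF

docker build -t demo .
docker run demo                  # prints: default message
docker run demo "custom text"    # CMD overridden, prints: custom text
```

Because ENTRYPOINT is fixed, both runs execute `echo`; only the CMD-supplied parameters change.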

EKS nuggets: Unauthorized error via kubectl

EKS is the managed Kubernetes service offering from AWS. After you have provisioned an EKS cluster you can access it via the kubectl utility. However, kubectl requires a kubeconfig file to be available in your home directory for it to reach the EKS cluster's API server. At times while running kubectl commands you might get an error like the one below: error: You must be logged in to the server (Unauthorized) This indicates that the user you are logged into the AWS CLI as is not the one who provisioned the cluster. AWS adds the user who provisioned the cluster as a cluster admin in the EKS cluster's RBAC. To download the EKS kubeconfig file you can run the AWS CLI command below: aws eks --region ap-south-1 update-kubeconfig --name <clustername>
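To debug this, a useful first step is checking which IAM identity the CLI is actually using, and then (as someone who already has cluster access) mapping the extra user in the aws-auth ConfigMap. A sketch, assuming AWS CLI access and a hypothetical user ARN:

```shell
# Which IAM identity is the AWS CLI using? It must correspond to an
# entry in the cluster's aws-auth ConfigMap (the provisioning user is
# granted access implicitly).
aws sts get-caller-identity

# A cluster admin can map additional IAM users by editing aws-auth:
kubectl edit configmap aws-auth -n kube-system
# ...and adding, under mapUsers, something like:
#   - userarn: arn:aws:iam::111122223333:user/dev-user
#     username: dev-user
#     groups:
#       - system:masters
```

Once the mapping is saved, the previously unauthorized user can re-run `aws eks update-kubeconfig` and access the cluster.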