
Upload/Download your Azure Storage files by using Azure Data Lake Storage Plugin with Workload Automation

May 19, 2021. Posted by Shubham Chaurasia

Before moving on to the Azure Storage plugin and how it benefits Workload Automation users, let us begin with a brief look at what Azure is.

“Azure is an open and flexible cloud platform that enables you to quickly build, deploy and manage applications across a global network of Microsoft-managed datacentres. You can build applications using any language, tool, or framework. And you can integrate your public cloud applications with your existing IT environment.”

Azure is incredibly flexible, and allows you to use multiple languages, frameworks, and tools to create the customised applications that you need. As a platform, it also allows you to scale applications up with unlimited servers and storage.

What is an Azure Storage Account?

The Azure Storage platform is Microsoft’s cloud storage solution for modern data storage scenarios. Core storage services offer a massively scalable object store for data objects, disk storage for Azure virtual machines (VMs), a file system service for the cloud, a messaging store for reliable messaging, and a NoSQL store.

An Azure storage account contains all your Azure Storage data objects: blobs, files, queues, tables, and disks. The storage account provides a unique namespace for your Azure Storage data that is accessible from anywhere in the world over HTTP or HTTPS. Data in your Azure storage account is durable and highly available, secure, and massively scalable.

Core storage services

The Azure Storage platform includes the following data services:

  • Azure Blobs: A massively scalable object store for text and binary data. Also includes support for big data analytics through Data Lake Storage Gen2.
  • Azure Files: Managed file shares for cloud or on-premises deployments.
  • Azure Queues: A messaging store for reliable messaging between application components.
  • Azure Tables: A NoSQL store for schemaless storage of structured data.
  • Azure Disks: Block-level storage volumes for Azure VMs.
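
Each of these services is reached through its own endpoint under the storage account's namespace. The following is a minimal sketch, assuming the default public-cloud endpoint suffixes and a made-up account name (`mystorageacct`); custom domains and sovereign clouds use different hosts. Azure Disks are attached to VMs rather than addressed through an endpoint like this, so they are left out.

```python
# Default public-cloud endpoint hosts for the services of a storage account.
# The account name used in the demo below is a made-up example.
SERVICE_SUFFIXES = {
    "blob": "blob.core.windows.net",
    "dfs": "dfs.core.windows.net",  # Data Lake Storage Gen2
    "file": "file.core.windows.net",
    "queue": "queue.core.windows.net",
    "table": "table.core.windows.net",
}


def service_endpoint(account_name: str, service: str) -> str:
    """Return the HTTPS endpoint for one service of a storage account."""
    if service not in SERVICE_SUFFIXES:
        raise ValueError(f"unknown service: {service!r}")
    return f"https://{account_name}.{SERVICE_SUFFIXES[service]}"


if __name__ == "__main__":
    for name in SERVICE_SUFFIXES:
        print(name, service_endpoint("mystorageacct", name))
```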

Introduction to Azure Data Lake Storage Gen2

Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob storage. Data Lake Storage Gen2 converges the capabilities of Azure Data Lake Storage Gen1 with Azure Blob storage. For example, Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. Since these capabilities are built on Blob storage, it provides low-cost, tiered storage, with high availability/disaster recovery capabilities.
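
To make "file system semantics" a little more concrete, here is a small sketch of how a file in a Data Lake Storage Gen2 container is addressed through a directory path. It builds an `abfss://` URI of the form used by the Azure Blob File System driver and lists the directories above a file. The account, container, and path names are hypothetical.

```python
def abfss_uri(account: str, file_system: str, path: str) -> str:
    """Build an abfss:// URI for a file or directory in a Gen2 container."""
    clean = path.strip("/")
    return f"abfss://{file_system}@{account}.dfs.core.windows.net/{clean}"


def parent_directories(path: str) -> list:
    """List every directory above a file, outermost first."""
    parts = path.strip("/").split("/")[:-1]
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(abfss_uri("mystorageacct", "landing", "raw/2021/05/sales.csv"))
    print(parent_directories("raw/2021/05/sales.csv"))
```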

Figure 1 Azure Data Lake Gen2

The following example illustrates the benefits:

Cloud computing has enabled many teams to adopt agile development methods. They need to repeatedly deploy their solutions to the cloud, and know their infrastructure is in a reliable state. As infrastructure has become part of the iterative process, the division between operations and development has disappeared. Teams need to manage infrastructure and application code through a unified process.

To meet these challenges, you can automate the upload and download of multiple files and apply the practice of infrastructure as code.

Using Azure SPN (service principal name) credentials or an access key, the user can log in and select an available container in the Azure storage account.

Instead of using the Azure portal, you can upload or download files by using the Azure Storage plugin with Workload Automation. After logging in with Azure SPN credentials or an access key, the user can see all the files available on the server (Azure Storage, Data Lake Gen2).

Let us now turn to the plugin itself, starting with the job definition parameters.

Azure Storage Plugin

Log in to the Dynamic Workload Console and open the Workload Designer. Choose to create a new job and select the “Azure Data Lake Storage Plugin” job type in the Cloud section.

Figure 2 Job Definition

Connection Tab

Establishing a connection to the Azure server:

Connection Info

Use this section to connect to the Azure server.

Subscription – The ID that uniquely identifies your subscription to Azure. This attribute is required. If not specified in the job definition, it must be supplied in the plug-in properties file.

Client – The Azure Client ID associated with your SPN account. This attribute is required. If not specified in the job definition, it must be supplied in the plug-in properties file.

Tenant – The Azure Tenant ID associated with your SPN account. This attribute is required. If not specified in the job definition, it must be supplied in the plug-in properties file.

Password (Key) – The Azure Client Secret Key (also known as the client key) associated with your SPN account. This attribute is required. If not specified in the job definition, it must be supplied in the plug-in properties file.

Account Name – The account name associated with your Azure Data Storage account.

Test Connection – Click to verify that the connection to the Azure server works correctly.

Figure 3 connection tab – SPN
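
The fallback rule described above (use the value from the job definition, otherwise the one from the plug-in properties file, and fail if neither is set) can be sketched as follows. This is only an illustration of the rule, not the plugin's actual code: the key names `subscription`, `client`, `tenant`, and `password` are assumptions, since the real property keys are not listed here.

```python
# Hypothetical key names; the plug-in's real property keys may differ.
REQUIRED_FIELDS = ("subscription", "client", "tenant", "password")


def resolve_connection(job_definition: dict, properties: dict) -> dict:
    """Pick each required value from the job definition, else the properties."""
    resolved = {}
    missing = []
    for field in REQUIRED_FIELDS:
        # An empty or absent job-definition value falls back to the properties.
        value = job_definition.get(field) or properties.get(field)
        if value:
            resolved[field] = value
        else:
            missing.append(field)
    if missing:
        raise ValueError("missing required connection fields: " + ", ".join(missing))
    return resolved


if __name__ == "__main__":
    job = {"subscription": "sub-from-job", "client": "client-from-job"}
    props = {"tenant": "tenant-from-props", "password": "secret-from-props"}
    print(sorted(resolve_connection(job, props)))
```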

OR

Access Key Authentication

Account Name – The account name associated with your Azure Data Storage account.

Access Key – The access key of your storage account, used to authorize access to its data.

Figure 4 Connection Tab – Access key
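
The two connection modes need different sets of values. A rough sketch of that choice is shown below, under the assumption that the mode is selected explicitly and that the account name is needed in both cases. The class names, the `mode` strings, and the setting keys are all hypothetical and only mirror the fields listed above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SpnAuth:
    account_name: str
    subscription: str
    client: str
    tenant: str
    password: str


@dataclass(frozen=True)
class AccessKeyAuth:
    account_name: str
    access_key: str


SPN_FIELDS = ("subscription", "client", "tenant", "password")


def build_auth(mode: str, settings: dict) -> Union[SpnAuth, AccessKeyAuth]:
    """Validate the settings for the chosen mode and return a typed record."""
    account = settings.get("account_name")
    if not account:
        raise ValueError("account_name is required")
    if mode == "access_key":
        key = settings.get("access_key")
        if not key:
            raise ValueError("access_key is required in access key mode")
        return AccessKeyAuth(account, key)
    if mode == "spn":
        missing = [name for name in SPN_FIELDS if not settings.get(name)]
        if missing:
            raise ValueError("missing SPN fields: " + ", ".join(missing))
        return SpnAuth(account, *(settings[name] for name in SPN_FIELDS))
    raise ValueError(f"unknown mode: {mode!r}")
```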

Action Tab

Use this section to define the operation details.

Operation

Container Name – Specify the name of the container in which the files are stored. Click the Select button to choose a container name defined in the cloud console. Select an item from the list; the selected item is displayed in the Container Name field.

Figure 5 Action Tab – Select Container

Select Operations

Use this section to either upload or download objects.

Figure 6 Action tab – upload

Upload File – Click this radio button to upload files to the Storage Account.

Folder Location Inside Container – Enter the name of the file to be uploaded or the path where the file is stored. Click the Search button to choose a file name defined in the cloud console. You can select multiple files from the list; the selected items are displayed in the Folder Location Inside Container field.

Source File Paths – Displays the path of the source file. You can use the filter option to streamline your search.

If a file already exists – Select the action the application should take if the uploaded file already exists in the console.

· Replace – Replaces the existing file in the console.

· Skip – Skips the upload of the selected file.
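
The effect of the Replace and Skip options can be sketched as a small planning step that decides which of the selected files are actually uploaded. This is an illustration of the described behavior, not the plugin's implementation; the function and enum names are hypothetical.

```python
from enum import Enum


class IfExists(Enum):
    REPLACE = "replace"
    SKIP = "skip"


def plan_upload(source_names, existing_names, policy):
    """Split the selected files into (to_upload, skipped) for a given policy."""
    existing = set(existing_names)
    to_upload, skipped = [], []
    for name in source_names:
        if name in existing and policy is IfExists.SKIP:
            skipped.append(name)  # already in the container, left untouched
        else:
            to_upload.append(name)  # new file, or existing file to overwrite
    return to_upload, skipped


if __name__ == "__main__":
    selected = ["sales.csv", "orders.csv"]
    in_container = ["orders.csv"]
    print(plan_upload(selected, in_container, IfExists.SKIP))
    print(plan_upload(selected, in_container, IfExists.REPLACE))
```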

Download File – Click this radio button to download files from the Storage Account.

Figure 7 Action tab – Download

Select Files – Click the Select Files button to choose the files defined in the cloud.

Destination File Path – Provide the location to download or upload files. Click the Select button to choose the location of the source file; the selected item is displayed in the Destination File Path field.
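
As a rough illustration of how selected files might map onto a destination path, the sketch below pairs each remote path with a local target. It assumes that only the file name is kept (the remote folder structure is dropped), which may not match what the plugin does, and it rejects selections that would collide on the same local name. All names are hypothetical.

```python
from pathlib import Path, PurePosixPath


def local_targets(remote_paths, destination):
    """Map each selected remote path to a file directly under destination."""
    dest = Path(destination)
    targets = {}
    seen = {}  # local file name -> remote path that claimed it
    for remote in remote_paths:
        name = PurePosixPath(remote).name
        if not name:
            raise ValueError(f"not a file path: {remote!r}")
        if name in seen:
            raise ValueError(
                f"{remote!r} and {seen[name]!r} would both be saved as {name!r}"
            )
        seen[name] = remote
        targets[remote] = dest / name
    return targets


if __name__ == "__main__":
    for remote, local in local_targets(
        ["raw/2021/sales.csv", "logs/app.log"], "downloads"
    ).items():
        print(remote, "->", local)
```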

Submitting your job

It is time to submit your job into the current plan. You can add your job to the job stream that automates your business process flow. Select the action menu in the top-left corner of the job definition panel and click Submit Job into Current Plan. A confirmation message is displayed, and you can switch to the Monitoring view to follow the job's progress.

Figure 8 Submit Job

Figure 9 Monitor Job

Figure 10 Monitor Job

Figure 11 Job Log

Figure 12 Workflow Details

Are you curious to try out the Azure Data Lake Storage plugin? Download the integrations from the Automation Hub and get started or drop a line at santhoshkumar.kumar@hcl.com.

 

Authors' Bio

Shubham Chaurasia – Developer at HCL Software

Responsible for developing integration plug-ins for Workload Automation. Hands-on with different programming languages and frameworks like Java, JPA, Microservices, MySQL, Oracle RDBMS, AngularJS.

LinkedIn – https://www.linkedin.com/in/shubham-chaurasia-1a78b8a9/

 

Rabic Meeran K, Technical Specialist at HCL Technologies

Responsible for developing integration plug-ins for Workload Automation. Hands-on with different programming languages and frameworks like Java, JPA, Spring Boot, Microservices, MySQL, Oracle RDBMS, Ruby on Rails, Jenkins, Docker, AWS, C and C++.

LinkedIn – https://www.linkedin.com/in/rabic-meeran-4a828324/

 

Saket Saurav, Tester (Senior Engineer) at HCL Technologies

Responsible for performing automated and manual testing of different plugins in Workload Automation using the Java Unified Test Automation Framework. Hands-on experience with the Java programming language and web services, with databases like Oracle and SQL Server.

LinkedIn – https://www.linkedin.com/in/saket-saurav-8892b546/