October 14, 2016

Generate Word documents with PowerShell

In the previous post I mentioned that I was asked to produce a PDF document with source code for a library. In that post I reviewed how to detect and fix file encoding and process files line by line. In this post I'll show how to create Word/PDF documents from PowerShell.
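One common way to do that (not necessarily the exact approach from the post) is to drive Word through COM automation. Here's a minimal sketch; it assumes Microsoft Word is installed and uses a made-up output path:

# Minimal sketch: create a Word document via COM and export it as PDF.
$word = New-Object -ComObject Word.Application
$word.Visible = $false
$doc = $word.Documents.Add()
$doc.Content.Text = 'Hello from PowerShell'
$doc.ExportAsFixedFormat('C:\temp\out.pdf', 17)   # 17 = wdExportFormatPDF
$doc.Close(0)                                     # 0 = wdDoNotSaveChanges
$word.Quit()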

Changing source files encoding and some fun with PowerShell

One day I was asked to assist in creating a PDF document with all the source code of one of our libraries. That weird task was needed for patenting our product. It could of course be done manually, but that would be humiliating for a dev. Obviously it can be easily automated in many ways. It turned out to be a fun adventure.

First of all, let me describe the context: the library is a .NET solution with almost 3000 C# files. It's a little bit big :).
The first issue I encountered was that the solution contained files in different encodings. All comments were written in Russian, so some files were in a single-byte ANSI encoding (windows-1251) while others were in UTF-8. It's a mess. So the first idea was to normalize all files to UTF-8, despite the fact that the comments themselves are not needed for this task. Besides comments, though, the sources contain string literals with national characters, so the encoding matters anyway. By "matters" I mean that it has to be known in order to read and process a file correctly.
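As a taste of the conversion itself, here's a minimal PowerShell sketch (not the actual script): it re-saves one file as UTF-8, assuming we already know its current encoding is windows-1251 (detecting the encoding is the tricky part):

# Minimal sketch: re-save one file as UTF-8, assuming its current encoding
# is already known to be windows-1251. The path is made up.
$path = 'C:\src\SomeFile.cs'
$cp1251 = [System.Text.Encoding]::GetEncoding(1251)
$text = [System.IO.File]::ReadAllText($path, $cp1251)
[System.IO.File]::WriteAllText($path, $text, [System.Text.Encoding]::UTF8)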

In this post I'll talk about converting encodings, and in the next one about generating Word/PDF files.

December 21, 2015

Windows 10 and AMD Radeon HD4850

One day I installed Windows 10 on my work PC. Well, I did it after I had installed Win10 on my home PC and had been living with it for a while. As I hadn't encountered anything terribly wrong, I decided to migrate my work PC as well. But not everything went smoothly in this process.

The first issue I encountered was with the video adapter. I have a Radeon HD4850 in that machine. After Windows rebooted to complete the installation, I got the "Microsoft Basic Display Adapter" in Device Manager and lost multiple monitor support. The Basic Display Adapter doesn't support multiple monitors and just shows the same picture on all of them. It's funny to have a stereo picture on both monitors indeed, but actually it's a disaster. Updating the driver via Device Manager didn't work, so I went to the AMD site to download a new driver. But it turned out that AMD decided not to support the Radeon 4000 series on Windows 10 at all. Just no drivers. "Nobody makes you upgrade to Windows 10", they say. Nice shot, AMD.
Here's an example of such a thread - https://community.amd.com/message/2660992 - and the AMD forums are full of them.
What a nice answer: "The 4xxx series cards are not supported on Windows 10 as they do not meet the minimum requirements". Ridiculous!
Here are the Windows 10 requirements:
Graphics card:
DirectX 9 or later with WDDM 1.0 driver

What the hell AMD?!

So the first conclusion for this day: do not buy AMD cards anymore. Ever. Just don't.

When I was close to despair, I decided to try the integrated video on my motherboard and rebooted to go into the BIOS. There is a switch there for "active video" with the options Auto (which was enabled), PCI, PCI-E and Internal.
I chose "Internal" (IGD) and rebooted. But I forgot to switch the VGA/DVI cables, so I decided to wait and see what would happen. To my surprise, nothing changed - both monitors, still attached to my HD4850 (which is on PCI-E, obviously), worked as before. I logged into Windows, went to Device Manager and tried to update the driver for the "Microsoft Basic Display Adapter". And it was a miracle! Windows started downloading a new driver and installed "ATI Radeon HD 4800 Series".

After the new driver was installed, multiple monitor support worked as it should.

August 12, 2015

Organize your image files

These days we have many sources of photos: smartphones, cameras, old cameras, old phones and so on. It is common to use some sort of cloud service for keeping photos, and it is very handy to have them automatically uploaded to the cloud from a device - but currently that works well mostly for smartphones. So despite using cloud services like Google Photos/Drive, OneDrive, Dropbox, Amazon Cloud Drive and the like, you probably still have all your photos on your hard drives, organized in some kind of folder structure.
There is a lot of software to help with organizing files, and photos in particular. For me there has always been one problem here - I don't want a magical piece of software to hide my files from me. This is why I like Google Picasa: it provides a nice UI over the folder structure and stays in sync with it. But this time I want to talk not about organizing files into a folder structure, but about managing the files themselves.

For image files we have the following important properties:
  • file name
  • modified date - file attribute
  • date taken - EXIF metadata
All these properties can help in managing the files. It's nice when a file's name contains its timestamp - the date and time the photo was taken. It's also nice when a file's modified date (a file attribute supported by any OS/file system) is the same as that timestamp.

Unfortunately, some of these properties are often incorrect or mixed up. Let's review all of them.


File name

For files from cameras it's common to have names like IMG_1234.jpg. For files from smartphones it's common to have 20150812_174054.jpg. That's much better, but different OSes/devices use different patterns, so it could also be IMG_20150812_174054.jpg or something else.

It can be helpful to have all names follow the same pattern. Obviously the name '20150812_174054.jpg' is more informative than 'IMG_1234.jpg': it tells us when the photo was taken without looking into its metadata.
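Extracting that timestamp back from the name is straightforward; here's a hypothetical PowerShell snippet (not part of any tool mentioned here) that parses it:

# Hypothetical helper: extract a yyyyMMdd_HHmmss timestamp from a file name.
$name = 'IMG_20150812_174054.jpg'
if ($name -match '(\d{8})_(\d{6})') {
    [datetime]::ParseExact($Matches[1] + $Matches[2], 'yyyyMMddHHmmss', $null)   # 12 August 2015 17:40:54
}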

Modification date

It's important for files to have a correct modification date. It allows us to sort all the files in our collection by their "modified date" attribute. Not all software supports reading EXIF metadata, and that metadata can be missing altogether. It's especially important for cloud services, as they usually let you look through all your photos on one timeline.

Unfortunately, modification dates are often updated by software during operations on files. For example, when we rotate a photo in Windows Photo Viewer, it updates the modification date. Technically that's correct, as the file did change. But usually I don't care about such modifications and want to keep the date the photo was taken rather than the date it was changed by some software.
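Fixing the attribute back is a one-liner; for example, in PowerShell (made-up path and date):

# Hypothetical example: set a photo's modified date back to the time it was taken.
(Get-Item 'C:\Photos\20150812_174054.jpg').LastWriteTime = [datetime]'2015-08-12 17:40:54'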


EXIF metadata

This is metadata embedded into image files by cameras/smartphones to keep additional info like camera model, exposure, f-stop and so on. It's the source of truth for timestamps. The attribute we need is 'date taken'.
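Reading it doesn't require special tools. Here's a rough PowerShell sketch using System.Drawing (EXIF tag 0x9003 is 'DateTimeOriginal'; the path is made up):

# Rough sketch: read the EXIF "date taken" value (tag 0x9003, DateTimeOriginal).
Add-Type -AssemblyName System.Drawing
$path = 'C:\Photos\20150812_174054.jpg'
$img = [System.Drawing.Image]::FromFile($path)
try {
    $bytes = $img.GetPropertyItem(0x9003).Value                          # e.g. "2015:08:12 17:40:54"
    $text = [System.Text.Encoding]::ASCII.GetString($bytes).Trim([char]0)
    [datetime]::ParseExact($text, 'yyyy:MM:dd HH:mm:ss', $null)
}
finally {
    $img.Dispose()
}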

Organize them all

Here are some common steps that help organize photo files:
  • Extract EXIF metadata, if it exists, and put the 'date taken' value into the 'modification date' file attribute.
  • If there is no EXIF metadata, try to extract the timestamp from the file name (20150812_174054.jpg).
  • Rename all files in such a way that the file name contains its timestamp (20150812_174054.jpg).
  • Remove unneeded prefixes like "IMG_" (IMG_20150812_174054.jpg), but keep the ones which provide some info (like "PANO_" for panoramas).
I was looking for software which would do these tasks for me but finally gave up and wrote a simple script. Meet Fix-TS.ps1.

Let's get to know it through examples.

powershell ./fix-ts.ps1 /path/to/ -source exif
Update the timestamps of all files in a folder using values from EXIF metadata (if present).

powershell ./fix-ts.ps1 /path/to/ -filter *.jpg,*.png
Update the timestamps of *.jpg and *.png files in a folder using values parsed from the file names,
e.g. for '20151231_235959.jpg' the timestamp will be 31 December 2015, 23:59:59.

powershell ./fix-ts.ps1 /path/to/ -rename remove-prefix
Remove all prefixes before the year part, e.g. 'IMG_20151207_235959.jpg' will be renamed to '20151207_235959.jpg'.

powershell ./fix-ts.ps1 /path/to/ -rename remove-prefix:!PANO
Remove all prefixes except 'PANO', i.e. 'PANO_20151207_235959.jpg' will not change but 'IMG_20151207_235959.jpg' will become '20151207_235959.jpg'.

powershell ./fix-ts.ps1 /path/to/ -rename add-prefix:jpg=IMG_|mp4=VID_|avi=VID_
Add the prefix 'IMG_' for all *.jpg files and the prefix 'VID_' for all *.mp4 and *.avi files.

powershell ./fix-ts.ps1 /path/to/ -rename rebuild -source exif
Rename all files using pattern `yyyyMMdd_hhmmss` with timestamps from their EXIF metadata.

Please note that if you run the script without the `-fix` switch, it won't change anything - it only reports the issues it found and the proposed fixes. Only running with the `-fix` switch makes it apply the fixes.

You can find the script on GitHub.

Hope it helps someone to keep things more organized.

June 1, 2015

Publish files to Artifactory with artifactory-publisher

In the previous post I already shared some experience on setting up Artifactory. Here I'll continue playing with it - now let's talk about publishing artifacts.

I needed to publish a lot of NuGet packages we already had into our new Artifactory. There are three ways to publish an artifact to Artifactory:
  • use the package manager's CLI tool (NuGet, npm and so on)
  • use the web UI (the "Deploy" tab) on the Artifactory server
  • use the REST API
Using the web UI is obviously tedious, as we have a lot of packages to publish.
Publishing via NuGet.exe gives no control over the folder/file layout in Artifactory. In this case the layout is determined by the Repository Layout, which is basically a regular expression, so it's pretty limited (it never worked for me).
So the most powerful method is the REST API. We just need to send the file in a PUT request to the desired URL.
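For example, deploying a single package that way could look something like this in PowerShell (the repository name and URL below are made up):

# Rough sketch: deploy one file to Artifactory with an HTTP PUT (made-up URL and repo).
$cred = Get-Credential
Invoke-WebRequest -Method Put `
    -Uri 'https://artifacts.company.com/artifactory/nuget-local/MyLib/1.0.0/MyLib.1.0.0.nupkg' `
    -InFile '.\MyLib.1.0.0.nupkg' `
    -Credential $cred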

I came up with a tiny tool, artifactory-publisher, to publish files to Artifactory via its REST API.
You can find the source code on GitHub and install it from npmjs.com:
npm i artifactory-publisher -g

The tool can be used both as a CLI tool and as a Node package.

The detailed documentation can be found in the README. Here's a usage example.

Suppose we have a local folder with a lot of NuGet packages (*.nupkg) which we want to publish into a repository, structuring them into different folders.

For example we have the following packages locally:

and we want to have them in our Artifactory in the following structure:

Here's some sample code showing how to do this with the help of the artifactory-publisher tool.
To run it we'll need to install the dependencies:
"dependencies": {
    "artifactory-publisher": "~1.0.0",
    "async": "^1.1.0",
    "q": "^1.4.1"
}

Also see the sample as a Gist.

It's just an example of a case where some processing is needed before determining the final URL of the files being published.

If you just need to publish a single file, you can use the tool from the command line:
artifactory-publisher -f /path/to/local_file.ext -t https://artifacts.company.com/repo/file.ext -u user -p pwd

May 29, 2015

Setting up Artifactory as npm repository behind Apache

Recently I was struggling with Artifactory to make it work as our npm repository. Here's some of that experience.

Scoped packages and encoded slash

npm has supported scoped packages since version 2.0.0. It's a great feature for managing in-house components which should not be published publicly to npmjs.org. Technically it's a prefix in the package name (`@myorg/`) which can be easily associated with a registry.

For example, we need to create an in-house Yeoman generator and publish it for the devs of our company. We can create a package with the scoped name "@myorg/generator-webapp".
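Its package.json might look something like this (only the name matters here; the other fields are illustrative):

{
  "name": "@myorg/generator-webapp",
  "version": "1.0.0",
  "description": "In-house Yeoman generator for our web apps"
}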

Next we can associate the scope "@myorg" with a registry. For example, we created an npm repository in Artifactory named "myorg-npm". By accessing the URL "http://artifacts.mydomain.org/artifactory/api/npm/myorg-npm/auth/myorg" as an authenticated user, we get from Artifactory the npm configuration settings to put into .npmrc (~/.npmrc).
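The returned settings look roughly like this (credential values elided):

@myorg:registry=http://artifacts.mydomain.org/artifactory/api/npm/myorg-npm/
//artifacts.mydomain.org/artifactory/api/npm/myorg-npm/:_password=...
//artifacts.mydomain.org/artifactory/api/npm/myorg-npm/:username=...
//artifacts.mydomain.org/artifactory/api/npm/myorg-npm/:email=...
//artifacts.mydomain.org/artifactory/api/npm/myorg-npm/:always-auth=true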


That's all in terms of NPM. But Artifactory needs more configuration.

Now the npm CLI will use the package name with an encoded slash: "@myorg%2fgenerator-webapp". By default, Tomcat and Apache restrict encoded slashes in URLs.
Configuring Artifactory for this is described in the documentation. Essentially, we need to put the parameter org.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true into the $ARTIFACTORY_HOME/tomcat/conf/catalina.properties file (for Artifactory 4.x) or into %ARTIFACTORY_HOME%\etc\artifactory.system.properties (for Artifactory 3.x).
But that's not enough if we have Apache in front of Artifactory - it has to be configured as well.
Two things need to be done:
  • set AllowEncodedSlashes to NoDecode (by default it's Off)
  • add the `nocanon` keyword to ProxyPass; it tells the mod_proxy module not to canonicalize URLs
BTW, without the `nocanon` keyword Artifactory will get URLs with the % symbol encoded once more, so %2F in "/@myorg%2fgenerator-webapp" becomes %252F ("/@myorg%252fgenerator-webapp").

The VirtualHost config should look like this:
<VirtualHost *:80>
    ServerName artifacts.mydomain.org
    AllowEncodedSlashes NoDecode
    ProxyPass / ajp://localhost:8022/ nocanon
</VirtualHost>

Now we can publish our package without any additional parameters:
npm publish
To install the package:
npm install @myorg/generator-webapp -g
Run our generator (it's nice that Yeoman fully supports scoped packages):
yo @myorg/webapp

Removing "/artifactory" path

I wanted my repositories to be accessible on a custom domain (http://artifacts.mydomain.org). But by default Artifactory always expects to be accessed via the /artifactory path. For example, initially it listens on http://localhost:8081/artifactory. And even after we move Artifactory behind Apache it still expects that path. The documentation describes how this path can be customized, but says nothing about how to remove it completely.
To me, keeping the path seems completely unnecessary.

So here's the Apache configuration for proxying Artifactory without the path:
<VirtualHost *:80>
    ServerName artifacts.mydomain.org
    AllowEncodedSlashes NoDecode

    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>

    ProxyPreserveHost On
    ProxyPassReverseCookiePath /artifactory/ /
    ProxyPass / ajp://localhost:8022/artifactory/ nocanon
    ProxyPassReverse / http://artifacts.mydomain.org/artifactory/

    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^artifacts\.mydomain\.org$ [NC]
    RewriteRule ^/artifactory/(.*)$ /$1 [L,R=301]
</VirtualHost>

To use RewriteEngine, the mod_rewrite module must be loaded.
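For example, on Debian/Ubuntu it's enabled with a2enmod rewrite; on other setups it's a LoadModule line in the Apache config (the exact module path depends on the installation):

LoadModule rewrite_module modules/mod_rewrite.so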

And do not forget to change the registry URL in .npmrc ("//artifacts.mydomain.org/artifactory/api/" -> "//artifacts.mydomain.org/api/").