
How to count pdf color or bw pages using php except imagemagic?

Posted: 2017-07-11T00:13:17-07:00
by gopsmaheta99
I have a function that first converts each PDF page to an image and then checks whether that image is colored or not.

The command below is slow because it runs inside a for loop, and it makes the server hang or takes it down.

So I want to find an alternative way to do this.


Code:

function getPDFInfo($file, $product_id) {
	try {
		$inputFile = $file;

		$product = Product::find($product_id);
		/*exec('G:\xampp\htdocs\pdf\bin\C#_ParsingTest.exe C:\Users\drindia\Desktop\pdf.pdf',$fill);
			print_r($fill);
			die;*/
		// Debug output; convert() here is presumably a helper defined elsewhere in the project.
		echo convert(memory_get_usage(true));

		if (extension_loaded('imagick')) {
			// pingImage() reads only the metadata, which is enough to count the pages.
			$imagick = new Imagick();
			$imagick->pingImage($inputFile);
			$number_page = $imagick->getNumberImages();

			$imagick2 = new Imagick();
			$grey = 0;
			$color = 0;

			if ($product->black_page_price == $product->color_page_price) {
				// Same price either way, so there is no need to inspect the pages.
				$grey = $number_page;
				$color = 0;
			} else {
				for ($i = 0; $i < $number_page; $i++) {
					// Render page $i and write it to a temporary JPEG.
					$result = $imagick2->readImage($inputFile.'['.$i.']');
					$filename = "UBQhklw64WO8AuVcQzzGkgZoZcTHisvq_".$i.'.jpg';
					$imagick2->setImageFormat("jpg");
					$imagick2->writeImage('./public/frontend/tempimages/'.$filename);
					// Mean of the saturation channel (HSL "g"); 0 means the page has no color.
					$result = exec('convert ./public/frontend/tempimages/'.$filename.' -colorspace HSL -channel g -separate +channel -format "%[fx:mean]" info:');
					//$result = exec('gs  -o - -sDEVICE=inkcov color-or-grayscale-test.pdf');

					// Cast explicitly so an empty result (failed command) is treated as 0.
					if ((float) $result > 0) {
						$color++;
					} else {
						//echo "Page Number : ".$i."Color ".$imageType."<br>";
						$grey++;
					}
					$imagick2->clear();
					$imagick2->destroy();

					File::delete('./public/frontend/tempimages/'.$filename);
				}
			}

			$imagick->clear();
			$imagick->destroy();
			return ['grey' => $grey, 'color' => $color, 'total_pages' => $number_page];
		} else {
			echo "Error";
			return ['grey' => '0', 'color' => '0', 'total_pages' => '0'];
		}
	} catch (Exception $e) {
		return ['grey' => '0', 'color' => '0', 'total_pages' => '0'];
	}
}

Re: How to count pdf color or bw pages using php except imagemagic?

Posted: 2017-07-11T07:56:30-07:00
by snibgo
You are getting the saturation channel of an HSL version of each page.

Okay. To do that, you loop through the pages. For each page, you read it, then you save this as a jpeg, then you read the jpeg, convert it to colorspace HSL, separate the G channel, calculate the mean, and delete the jpeg.

You can do all that much more simply, without writing any files:

Code:

convert zebrax.pdf -colorspace HSL -format "%[fx:mean.g] " info:
This gives you a space-separated list of the mean saturations. You could use a comma or \n to separate the elements if you want.
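
On the PHP side, turning that list into per-page results could look roughly like the sketch below. The function name classifyPages and the $threshold parameter are made up for illustration; they are not part of ImageMagick or PHP. It only parses the text that the command prints, so it assumes you have already captured that output, for example with exec().

```php
<?php
// Hypothetical helper: classify pages from the text printed by
//   convert file.pdf -colorspace HSL -format "%[fx:mean.g] " info:
// $threshold is an assumption; 0.0 treats any saturation at all as color.
function classifyPages(string $output, float $threshold = 0.0): array
{
    // Split on whitespace and drop empty pieces, so a trailing space
    // or newline does not produce an empty entry.
    $values = preg_split('/\s+/', trim($output), -1, PREG_SPLIT_NO_EMPTY);

    $pages = [];
    foreach ($values as $index => $value) {
        // Page numbers in the result are 1-based.
        $pages[$index + 1] = ((float) $value > $threshold) ? 'color' : 'grey';
    }

    $counts = array_count_values($pages);

    return [
        'pages'       => $pages,
        'color'       => $counts['color'] ?? 0,
        'grey'        => $counts['grey'] ?? 0,
        'total_pages' => count($pages),
    ];
}

// Example with a made-up output string for a 3-page PDF.
$result = classifyPages('0.25 0 0.013 ');
print_r($result['pages']);
```

The 'pages' entry holds one label per page number, and the other entries hold the totals. Raising $threshold above 0 is one way to ignore very faint color, if that turns out to matter for your files.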

Re: How to count pdf color or bw pages using php except imagemagic?

Posted: 2017-07-12T00:54:25-07:00
by gopsmaheta99
Thanks for the reply.
I get your point, but suppose I have a 5-page PDF: how can I check how many pages are color and how many are black and white?
I want a separate value for each page.
Can you help me with this? I have been stuck on this issue for the last month.
I have a 5000-page PDF, and when it runs on the server it crashes the server.
So it would be nice if you could help me with this.
Thanks

Re: How to count pdf color or bw pages using php except imagemagic?

Posted: 2017-07-12T01:23:35-07:00
by snibgo
The command I showed you provides one value per page. Your PHP script would then loop through those values.

For large PDF files that won't fit into memory, you will need to loop through and convert each page separately, myfile.pdf[0], myfile.pdf[1] etc. But there is no need to write and read a jpeg for each page, and no need to "-separate" for each page.
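
A rough sketch of that per-page loop in PHP, with no temporary files, might look like this. The function names buildPageCommand and countPagesOneByOne are invented for the example, and the page count is assumed to come from somewhere else, such as the pingImage() / getNumberImages() calls in the original function.

```php
<?php
// Hypothetical helper: build the command for one page (0-based index).
function buildPageCommand(string $pdfPath, int $pageIndex): string
{
    // escapeshellarg() keeps spaces and brackets in the path away from the shell.
    $source = escapeshellarg($pdfPath . '[' . $pageIndex . ']');

    return 'convert ' . $source . ' -colorspace HSL -format "%[fx:mean.g]" info:';
}

// Hypothetical helper: run one convert per page so only one page
// has to be held in memory at a time.
function countPagesOneByOne(string $pdfPath, int $numberOfPages): array
{
    $color = 0;
    $grey = 0;

    for ($i = 0; $i < $numberOfPages; $i++) {
        $lines = [];
        exec(buildPageCommand($pdfPath, $i), $lines, $exitCode);

        if ($exitCode !== 0 || !isset($lines[0])) {
            throw new RuntimeException("convert failed on page $i");
        }

        // Mean saturation above 0 means the page contains some color.
        if ((float) $lines[0] > 0) {
            $color++;
        } else {
            $grey++;
        }
    }

    return ['grey' => $grey, 'color' => $color, 'total_pages' => $numberOfPages];
}
```

A call such as countPagesOneByOne('myfile.pdf', 5000) still starts one convert process per page, so it will not be fast, but each process only has to read a single page.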

Re: How to count pdf color or bw pages using php except imagemagic?

Posted: 2017-07-12T01:37:11-07:00
by gopsmaheta99
Thank you again for helping me out.
I understand that there is no need to convert each PDF page into an image, but suppose I have a 10-page PDF: how can I get the value for each separate page?
For example:
PDF page 1 -> color
PDF page 2 -> b/w
PDF page 3 -> color

Hence 2 color pages and 1 b/w page.

Code:

$result = $imagick2->readImage($inputFile.'['.$i.']');
$filename = "UBQhklw64WO8AuVcQzzGkgZoZcTHisvq_".$i.'.pdf';

$result = exec('convert ./public/frontend/tempimages/'.$filename.' -colorspace HSL -format "%[fx:mean]" info:');
I don't actually understand how to get a separate value for each PDF page.

Re: How to count pdf color or bw pages using php except imagemagic?

Posted: 2017-07-17T04:23:45-07:00
by gopsmaheta99
I found this, and it works, but the counts of color and b/w pages do not match the counts I get from the earlier approach, where the convert command was run on an image of each separate PDF page.
Can anyone help me?

Code:

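$result = exec('convert ./uploads/images/temporders/'.$filename.' -colorspace HSL -format "%[fx:mean.g] " info:', $fill);

// All values arrive on the first output line, separated by spaces.
$colors = $fill[0] ?? '';
// Drop empty pieces so empty output does not count as a grey page.
$allcolors = preg_split('/\s+/', $colors, -1, PREG_SPLIT_NO_EMPTY);
foreach ($allcolors as $c) {
    if ((float) $c > 0) {
        $color++;
    } else {
        $grey++;
    }
}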